UNCLASSIFIED 

AD 405 925


DEFENSE  DOCUMENTATION  CENTER 

FOR 

SCIENTIFIC  AND  TECHNICAL  INFORMATION 

CAMERON  STATION.  ALEXANDRIA,  VIRGINIA 


UNCLASSIFIED 


NOTICE:  When  government  or  other  drawings,  speci¬ 
fications  or  other  data  are  used  for  any  purpose 
other  than  in  connection  with  a  definitely  related 
government  procurement  operation,  the  U.  S. 
Government  thereby  incurs  no  responsibility,  nor  any 
obligation  whatsoever;  and  the  fact  that  the  Govern¬ 
ment  may  have  formulated,  furnished,  or  in  any  way 
supplied  the  said  drawings,  specifications,  or  other 
data  is  not  to  be  regarded  by  implication  or  other¬ 
wise  as  in  any  manner  licensing  the  holder  or  any 
other  person  or  corporation,  or  conveying  any  rights 
or  permission  to  manufacture,  use  or  sell  any 
patented  invention  that  may  in  any  way  be  related 
thereto. 




RADC-TDR-63-136 


14  March  1963 


Final  Report 

STATISTICAL  INSTRUMENTATION  STUDY 


A.  W.  Crooke 
W.  B.  Floyd 
A.  H.  Nuttall 


LITTON  SYSTEMS,  INC. 

Data  Systems  Division 
Communication  Sciences  Laboratory 
221  Crescent  Street 
Waltham  54,  Massachusetts 


Contract  AF30(602)-2663 
Project  4519,  Task  451903 
Electronic  Systems  Division  (RADC) 


Prepared  for 

Rome  Air  Development  Center 
Air  Force  Systems  Command 
United  States  Air  Force 




Griffiss  Air  Force  Base 
New  York 


Qualified  requestors  may  obtain  copies  of  this  report  from  the  ASTIA  Document 
Service  Center,  Dayton  2,  Ohio.  ASTIA  Services  for  the  Department  of  Defense 
contractors  are  available  through  the  “Field  of  Interest  Register”  on  a  “need-to- 
know”  certified  by  the  cognizant  military  agency  of  their  project  or  contract. 



Statistical  Instrumentation  Study 
Contract AF30(602)-2663


14  March  1963 


Prepared  by 

A.  W.  Crooke 
W.  B.  Floyd 
A.  H.  Nuttall 


Approved  by 

George Sebestyen, Technical Director
John Geardes, Assistant Manager


LITTON  SYSTEMS,  INC. 
DATA  SYSTEMS  DIVISION 


Foreword 


This report has been prepared by the Communication Sciences
Laboratory within the Data Systems Division of Litton Systems, Inc., a
division of Litton Industries. The work reported here has been performed
over a period of 12 months, under Contract Number AF30(602)-2663, as
Task Number 451903 of Project Number 4519, entitled "Statistical
Instrumentation Study". This project has been completed under the direction of
the Communications Directorate within the Rome Air Development Center.


Several individuals within the Communication Sciences Laboratory
have made major contributions to the study and development of high order
statistical estimation techniques reported here. The Probability Analyzer
breadboard device has been designed and constructed under the guidance of
Mr. Arthur Crooke; most of the theoretical studies have been conducted by
Mr. Thomas Crystal, Mr. William Floyd and Dr. Albert Nuttall; and Dr.
George Sebestyen has provided technical guidance for the entire program.


ABSTRACT 


Several  techniques  for  estimating  n-th  order  statistics  of  signals 
are  investigated,  including  curve  fitting  methods  involving  estimation  of 
average  values  of  functions  of  signal  amplitudes,  and  success  counting 
methods  for  which  probabilities  are  estimated  as  the  percentage  time 
that  a  specified  condition  exists. 


One of the success counting methods is selected for implementation,
and a breadboard model constructed. This device will calculate fourth (and
lower) order joint and conditional probability density functions and distribution
functions for signals with bandwidth less than 10 Kcps. Provisions are
incorporated in the breadboard for calculating the probability of any fourth order
event in signal space, through simple and inexpensive modification of one unit in
the device.


TABLE  OF  CONTENTS 


1  INTRODUCTION

2  METHODS OF ESTIMATING N-TH ORDER STATISTICS OF SIGNALS
   2.1  Statistical Description of Signals
        2.1.1  First Order Descriptions
        2.1.2  Second Order Descriptions
        2.1.3  Higher Order Descriptions
        2.1.4  Stationarity
   2.2  Methods of Estimating N-th Order Statistics
        2.2.1  Moment Estimation Methods
               2.2.1.1  Power Series Approximation
               2.2.1.2  General Series Approximation
               2.2.1.3  Methods of Estimating Averages of Functions of Signals
        2.2.2  Success Counting Methods
               2.2.2.1  Parallel Processing with a Digital Computer
               2.2.2.2  Serial Processing with a Self Contained Device
               2.2.2.3  Establishment of a Success Region
               2.2.2.4  Sampling Rate Adjustment
   2.3  Accuracy Attainable With the Success Counting Method
        2.3.1  Quantization Error
        2.3.2  Error Caused by Finite Processing Time
        2.3.3  Errors Caused by Equipment Inaccuracies
        2.3.4  Effects of Non-Stationarity of Signals

3  A FOURTH ORDER SUCCESS COUNTING PROBABILITY ANALYZER
   3.1  General Description
   3.2  Detailed Description of Equipment
        3.2.1  Basic Units of the Analyzer
        3.2.2  Success Counter
        3.2.3  Arithmetic Unit
        3.2.4  Cell Location Counter
        3.2.5  Punch Logic and Data Format
        3.2.6  Use of Paper Tape Output
   3.3  Calibration Data
        3.3.1  D.C. Calibration Data
        3.3.2  Low Frequency Sinewave Calibration
        3.3.3  High Frequency Calibration Using a 10 Kc Sinewave
        3.3.4  Noise Measurement
   3.4  Expansion Capabilities of the Breadboard
        3.4.1  Modifications of the Analog Circuits
        3.4.2  Use of a Digital Comparator

4  CONCLUSIONS AND RECOMMENDATIONS

List of References

Appendix I



LIST OF ILLUSTRATIONS


Figure No.

 1  Block Diagram of a Fourth Order Moment Estimation Device
 2  Block Diagram for Estimating Coefficients in an Orthonormal Function
    Series Approximation to $p_n(\mathbf{x})$
 3  A Device for Displaying $p_1(x)$
 4  A Device for Displaying Two-Dimensional Cross Sections of $p(\mathbf{x})$
 5  Parallel Success Counting Probability Analyzer
 6  Block Diagram of Serial Success Counting Probability Analyzer
 7  Circuit for Registering $\mathbf{s} \in R$
 8  Probability Distribution Function Error Introduced by Quantization
 9  Success Waveform for a Sinewave Input
10  Probability Analyzer
11  Block Diagram of Probability Analyzer
12  Simplified Diagram of Success Indicator
13  Graphical Calculation of Sine Wave Density Function
14  Simplified Diagram of Comparator Circuit
15  Low Frequency Sinewave Calibration of the Four Channels
16  Histogram for Gaussian Noise Waveform



LIST  OF  TABLES 


Table No.

1  Ratios of Volumes of Polytopes and Hyperspheres for n = 2, n = 3, and
   Several Values of m
2  Calibration of Cell Location
3  D.C. Calibration of Cell Size
4  Low Frequency Calibration of Schmitt Trigger Hysteresis
5  Low Frequency Calibration of the Four Channels by Measuring the Duty
   Cycle of the Success Waveform f(s) for a 100 cps Sine Wave
6  High Frequency Calibration of the Four Channels by Measuring the Duty
   Cycle of the Success Waveform for a 10 KC Sine Wave
7  Calibration of the Four Channels using a 10 KC Sine Wave
8  Calibration of the Four Channels using 5 KC Band Limited Noise



1.  INTRODUCTION 


Problems in which high-order statistical characterizations of
signals are needed arise in many contexts, but the difficulties of measuring
and presenting appropriate data for such characterizations are often effective
barriers to their use. The absence of a device which would provide a
complete characterization of second order statistics of signals, for instance,
has led to the study of correlation between such quantities as the estimates of
target range and azimuth provided at the output of a radar. But
the correlation between two statistically fluctuating quantities may be not
only relatively uninformative for characterizing their joint behavior, but
also misleading if too much reliance is placed on the results obtained.


Moreover,  for  some  computations  there  is  no  short  cut  for  bypassing 
calculation  of  the  joint  n-th  order  statistics  of  a  set  of  signals  for  n  fairly 
large.  Determination  of  the  boundary  in  any  multi-dimensional  space  which 
results  from  a  threshold  applied  to  the  likelihood  ratio*  is  an  example. 


A variety of techniques for estimating first order statistics of signals**
have been implemented over the years, and in at least one case***, a device
has been built to estimate up to third order statistics. In Section 2, several
of these techniques are described and evaluated for audio frequency bandwidth
signals. One of these techniques has been selected for implementation, and a
breadboard has been constructed. The capabilities and details of construction
of this device are described in Section 3.


Conclusions regarding the utility of the techniques studied, and
recommendations for utilizing the capabilities of the device constructed, are
presented in Section 4.


*[7]
**Including those described in [3], [8], and [11].
***[5]


2. METHODS OF ESTIMATING N-TH ORDER STATISTICS OF SIGNALS


2.1  STATISTICAL  DESCRIPTION  OF  SIGNALS 

Due either to a lack of knowledge, inability to discover the basic
mechanism, or the genuinely random character of a signal or process, an
exact deterministic description of a future event from past history is often
impossible. For example, it is virtually impossible to predict the outcome
of a thoroughly shaken die; yet theoretically, given the initial position of the
die and the movements of the shaking element, the next outcome of the throw
is determined. However, since the determination of the next outcome would
involve an exorbitant amount of detailed computation, we often choose to
say that the outcome is random with the probability of any one particular face
being 1/6. Furthermore, we say that each outcome is independent of previous
results. In this manner, we obtain a complete (albeit approximate) statistical
description of the random process called die throwing. Whether or not this is
an adequate description depends on the symmetry of a particular die, and the
amount of shaking before throwing.


Similarly, with voltages which vary as functions of time, when there
appears (through a limited investigation) to be no underlying deterministic
behavior, we characterize such processes by probabilistic statements. It is
possible and customary to define a hierarchy of probabilistic rules, each of
which is more general than the previous one, and the limit of which is defined
as a complete statistical description of the process.


2.1.1 First Order Descriptors

The first rule is the probability that at a time $t$, a voltage value $s(t)$ will be
less than or equal to a given value $x$. This function, denoted $P_1(x, t)$, is called
the first order probability distribution, and is the most general first-order
statistic there is. From this quantity may be found other more simple first
order statistics such as the mean, variance, or $v$-th moment of the voltage,
all at time $t$. Entirely equivalent to $P_1$ are the first order probability density
function (p.d.f.), which is the derivative with respect to $x$ of $P_1$, and the first
order characteristic function, which is the Fourier transform of the p.d.f.
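
As a rough illustration of these first order descriptors (a sketch only, and not a description of the instrumentation reported here), the distribution and the moments can be estimated from a finite record of samples. The function names and the sine wave test record below are assumptions made for the example.

```python
import math

def first_order_distribution(samples, x):
    """Estimate P_1(x): the fraction of samples less than or equal to x."""
    return sum(1 for s in samples if s <= x) / len(samples)

def moment(samples, v):
    """Estimate the v-th moment of the voltage by a sample average."""
    return sum(s ** v for s in samples) / len(samples)

# Assumed test record: one full period of a unit-amplitude sine wave.
N = 10000
samples = [math.sin(2 * math.pi * k / N) for k in range(N)]

p_half = first_order_distribution(samples, 0.5)   # estimate of P_1(0.5)
mean = moment(samples, 1)
variance = moment(samples, 2) - mean ** 2
print(p_half, mean, variance)
```

For a sine wave the exact value of the distribution at $x = 0.5$ is $1/2 + \arcsin(0.5)/\pi = 2/3$, which gives a check on the estimate.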


2.1.2 Second Order Descriptors

The second rule is the joint probability that at time $t_1$, the voltage is
less than $x_1$, while at time $t_2$, the voltage is less than $x_2$. This function,
$P_2(x_1, t_1; x_2, t_2)$, is the second order probability distribution, and is quite an
information bearing quantity. For example, by letting $x_2$ equal infinity, we
realize the first order distribution, but in addition, from $P_2$ we may calculate
various cross moments such as $E[s(t_1)s(t_2)]$ or $E[s^2(t_1)s^2(t_2)]$. The first of
these two averages is the correlation function of the process $\{s(t)\}$ and has
proved to be very useful in filtering and prediction.* For instance, if the
correlation function depends only on $t_2 - t_1$, the Fourier transform of the
correlation function indicates directly where in frequency the power of the process
is located. The latter function is, of course, the power density function. Notice
that although $P_2$ gives the correlation function and/or the power density spectrum,
$P_2$ cannot in general be found from these latter quantities. That is, $P_2$ is a much
more general second order descriptor. Again, by differentiation or Fourier
transformation, respectively, two equivalent descriptors obtained are the second
order p.d.f. and characteristic function.
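
The correlation function mentioned above can be illustrated with a short sketch that estimates $E[s(t)s(t+\tau)]$ by a time average over a sampled record. This assumes a stationary record; the function name and the sine wave test signal are hypothetical choices for the example.

```python
import math

def correlation(samples, lag):
    """Time-average estimate of E[s(t) s(t + lag)] for a stationary record."""
    n = len(samples) - lag
    return sum(samples[k] * samples[k + lag] for k in range(n)) / n

# Assumed test record: 100 periods of a unit sine wave, 100 samples per period.
PERIOD = 100
samples = [math.sin(2 * math.pi * k / PERIOD) for k in range(100 * PERIOD)]

r_zero = correlation(samples, 0)                # lag of zero
r_quarter = correlation(samples, PERIOD // 4)   # lag of a quarter period
r_half = correlation(samples, PERIOD // 2)      # lag of half a period
print(r_zero, r_quarter, r_half)
```

For a sine wave of unit amplitude the correlation function is $\tfrac{1}{2}\cos\omega\tau$, so the three lags should give values near 0.5, 0 and -0.5. These three numbers say nothing about the shape of $P_2$ itself, which is the point made in the text.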


2.1.3 Higher Order Descriptors

The first and second order descriptors can be extended to the n-th
order, where one asks for the probability that at time $t_k$, the voltage is less
than $x_k$, $k = 1, 2, \ldots, n$. This is the n-th order probability distribution:
$P_n(x_1, t_1; x_2, t_2; \ldots; x_n, t_n) = P_n(\mathbf{x}, \mathbf{t})$. By letting some of the $\{x_k\}$ equal infinity,
lower order distributions are obtained. The larger n is made, the more
complete becomes the description of the random process.


In many problems one is interested in describing the statistical
behavior of several processes or signals which are defined in terms of a
common parameter. (Usually, time is the parameter.) For instance, it may
be desirable to know the joint (second order) probability that voltage $s_1$ is less
than $x_1$ at time $t_1$ and voltage $s_2$ is less than $x_2$ at time $t_2$. In general, the
function $P_n(\mathbf{x}, \mathbf{t})$ can be interpreted as a joint probability distribution of n
signals each of which may be observed at an independently specified time.
By considering the n signals to be delayed versions of a single waveform, the
n-th order probability distribution of the single waveform is seen to be a special
case of the more general interpretation of $P_n(\mathbf{x}, \mathbf{t})$.


In summary, an n-th order statistical description of signals is provided
by the probability that at time $t_k$ voltage $s_k$ is less than $x_k$, $k = 1, 2, \ldots, n$,
designated $P_n(\mathbf{x}, \mathbf{t})$. The voltages $\{s_k\}$, values $\{x_k\}$, and times $\{t_k\}$ may
or may not all be distinct.

*See [9], for instance.


2.1.4 Stationarity


When the n-th order probability distribution is a function only of time
differences, and not absolute time, for all n, the process is called strict sense
stationary. Such processes are often met in practice and possess the helpful
feature of remaining "statistically constant" as time progresses. Thus,
measurements at one time are as good as, and equivalent to, those made at another
time. Naturally this property can be, and has been, utilized to simplify data
collection and processing.


When the process is not necessarily strict sense stationary, but
possesses a mean which is independent of time, and a correlation function
which is dependent only on time differences, then the process is called
wide-sense stationary. The effects of nonstationarity on estimation accuracy are
discussed in Section 2.3.

2.2 METHODS OF ESTIMATING N-TH ORDER STATISTICS

In this section, several methods will be discussed by which higher order
statistical signal descriptors can be estimated. Before going into any details,
it is appropriate to point out an important qualitative aspect of the problem of
estimating statistical characteristics of signals. If a reasonably large amount
of time is available for processing the signals, then questions of optimal efficiency
of estimation methods in making use of a given sample size are not crucial. Only
if the time required to obtain a large amount of useful history of the signals is
high, do these questions deserve close scrutiny. With this in mind, we have
considered the selection of estimation methods to be based primarily on (a) the
versatility of the technique, and (b) simplicity and cost of construction and
operation of equipment which is required to implement any method.


There are essentially two basic approaches to the problem of estimating
an n-th order distribution function $P_n(\mathbf{x}, \mathbf{t})$. The first approach involves the
measurement of average values of quantities which are related to the signal
amplitudes, and will be called the moment estimation approach. The second approach
involves calculation of the percentage of time that the signal amplitudes satisfy a
specified set of conditions, and will be called the success counting approach. Each
of several ways of exploiting these two approaches to the problem of estimating
$P_n(\mathbf{x}, \mathbf{t})$ will be described in this section, indicating the versatility, relative
accuracy and simplicity of equipment associated with the technique.
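
The success counting idea can be sketched in a few lines, assuming sampled records and a second order case in which the second signal is a delayed version of the first. The function name, the sine wave record and the quarter-period delay are assumptions for illustration only; they are not taken from the equipment described later.

```python
import math

def success_count_estimate(s1, s2, x1, x2):
    """Fraction of sample instants at which s1 <= x1 and s2 <= x2 (a 'success')."""
    successes = sum(1 for a, b in zip(s1, s2) if a <= x1 and b <= x2)
    return successes / len(s1)

# Assumed test record: one period of a unit sine wave and a delayed copy of it.
N = 10000
s1 = [math.sin(2 * math.pi * k / N) for k in range(N)]
s2 = [s1[(k - N // 4) % N] for k in range(N)]   # s1 delayed by a quarter period

p_joint = success_count_estimate(s1, s2, 0.0, 0.0)
print(p_joint)
```

Here the success condition holds over one quarter of the period, so the estimate should be close to 0.25. Other success regions are obtained simply by changing the condition inside the count.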


The basic restrictions on signal sets which we shall assume for the
techniques to be considered are that (a) the amplitude of each signal is bounded
(the j-th amplitude, $s_j$, lies in the range $L_{j1} < s_j < L_{j2}$; $R_j = L_{j2} - L_{j1}$) and
(b) the bandwidth, $W_j$, of each signal is less than a specified quantity $W$.


2.2.1 Moment Estimation Methods


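It is well known that the n-th order probability density function,*
$p_n(\mathbf{x})$, can be obtained from a knowledge of all moments of the signal amplitudes,
i.e., the average values of the products of powers of the signal amplitudes.
Specifically, $p_n(\mathbf{x})$ is the Fourier transform of the characteristic function,
$F_n(\xi_1, \xi_2, \ldots, \xi_n) = F_n(\boldsymbol{\xi})$, and $F_n(\boldsymbol{\xi})$ can be expressed in terms of the
moments by a power series:

$$F_n(\boldsymbol{\xi}) = \sum_{k=0}^{\infty} i^k \sum_{\{k_m\}} \overline{\prod_{m=1}^{n} x_m^{k_m}} \; \prod_{m=1}^{n} \frac{\xi_m^{k_m}}{k_m!} \tag{1}$$

where the inner sum is taken over all $\{k_m\}$ such that $\sum_{m=1}^{n} k_m = k$, and

$$\overline{\prod_{m=1}^{n} x_m^{k_m}} = \text{one of the } \binom{n+k-1}{k} \text{ different moments of degree } \sum_{m=1}^{n} k_m = k.$$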

This characterization is possible for many processes with finite moments. Thus,
conceivably a procedure for estimating the moments of a process could be set
up, and the probability density function could be obtained by transforming the
resulting estimate of the characteristic function for the process. In practice,
however, only a finite number of moments can be estimated. The question then
arises: how is $p_n(\mathbf{x})$ related (even approximately) to an incomplete power series
representation of $F_n(\boldsymbol{\xi})$? Unfortunately, except in cases for which the function
is already known (notably the Gaussian p.d.f.), the answer to this question is in
general unknown. Therefore, if moments of a process are to serve as the
initial information bearing quantities from which an estimate of $p_n(\mathbf{x})$ is to
be derived, some other rationale for utilizing a finite number of moments
must be developed.

*For convenience, we shall write $P_n(\mathbf{x})$ and $p_n(\mathbf{x})$ for $P_n(\mathbf{x}, \mathbf{t})$ and $p_n(\mathbf{x}, \mathbf{t})$,
respectively, wherever it is not necessary to specify explicitly the times.


2.2.1.1 Power Series Approximation

One method of utilizing estimates of a finite number of moments is to
attempt to approximate $p_n(\mathbf{x})$ with a polynomial of the form

$$p_{nq}(\mathbf{x}) = \sum_{j=0}^{q} \sum_{k=1}^{A_{nj}} a_{jk}\, x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n} \tag{2}$$

where $\sum_{m=1}^{n} i_m = j$ for all values of j and k, and k takes on $A_{nj} = \binom{n+j-1}{j}$ values,
each of which corresponds to some ordering of the possible values for the
n-tuples $\{i_1, i_2, \ldots, i_n\}$, and q is called the degree of the polynomial. With
this approach, there are two parts in the problem of estimating a probability
density function:

1) Finding the coefficients, $\{a_{jk}\}$, which provide a good fit of a
   q-th degree polynomial to the true probability density function, and

2) Obtaining estimates of these coefficients.

The result is an estimate, $\hat{p}_{nq}(\mathbf{x})$, of the probability density function:

$$\hat{p}_{nq}(\mathbf{x}) = \sum_{j=0}^{q} \sum_{k=1}^{A_{nj}} \hat{a}_{jk}\, x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n} \tag{3}$$


To solve the first part of this problem, we need a device for measuring
the error in some sense between the polynomial $p_{nq}(\mathbf{x})$ and $p_n(\mathbf{x})$. One such measure
is the integrated square of the difference between $p_n(\mathbf{x})$ and $p_{nq}(\mathbf{x})$, where the
averaging is carried out with uniform weighting over the region for which $p_n(\mathbf{x})$ is
non-zero. Denoting this quantity by $\epsilon_q^2$ we have

$$\epsilon_q^2 = \int_X \left[ p_n(\mathbf{x}) - p_{nq}(\mathbf{x}) \right]^2 d\mathbf{x} \tag{4}$$

where X denotes all values of $\mathbf{x}$ for which $p_n(\mathbf{x})$ is non-zero. If it is desired
that $p_n(\mathbf{x})$ be approximated with high percentage accuracy over all of the region
X, then perhaps a better criterion is the integrated square of the relative error
between $p_n(\mathbf{x})$ and $p_{nq}(\mathbf{x})$. This quantity, denoted by $\epsilon_{qr}^2$, can be written

$$\epsilon_{qr}^2 = \int_X \left[ 1 - \frac{p_{nq}(\mathbf{x})}{p_n(\mathbf{x})} \right]^2 d\mathbf{x} . \tag{5}$$

While there are other means of determining goodness of fit we shall limit our
discussion to these two.


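No matter which method is employed, the general procedure for
determining the form of the $\{a_{jk}\}$ is the same. To illustrate this procedure, it will
be carried out in detail using the measure $\epsilon_q^2$. Here, the problem is to choose
the $\{a_{jk}\}$ such that $\epsilon_q^2$ is minimized, where

$$\epsilon_q^2 = \int_X \left[ p_n(\mathbf{x}) - \sum_{j=0}^{q} \sum_{k=1}^{A_{nj}} a_{jk}\, x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n} \right]^2 d\mathbf{x} \tag{6}$$

(Weierstrass' theorem tells us that we can choose the $\{a_{jk}\}$ to make
$\lim_{q \to \infty} \epsilon_q^2 = 0$ if the region over which $p_n(\mathbf{x})$ is non-zero is closed, and if
$p_n(\mathbf{x})$ is reasonably well-behaved.)

Setting the first derivative of $\epsilon_q^2$ with respect to $a_{rm}$ (for which $i_j = m_j$,
$j = 1, 2, \ldots, n$) equal to zero, for all $r = 0, 1, \ldots, q$, and $m = 1, 2, \ldots, A_{nr}$,
provides $N_{qn} = \sum_{j=0}^{q} \binom{n+j-1}{j} = \binom{n+q}{q}$ equations in $N_{qn}$ unknowns:

$$\sum_{j=0}^{q} \sum_{k=1}^{A_{nj}} a_{jk}\, C_{jk}^{(m)} = d_m , \qquad m = 1, 2, \ldots, N_{qn} , \tag{7}$$

where

$$C_{jk}^{(m)} = \int_X x_1^{m_1+k_1} x_2^{m_2+k_2} \cdots x_n^{m_n+k_n}\, d\mathbf{x}$$

$$d_m = \int_X x_1^{m_1} x_2^{m_2} \cdots x_n^{m_n}\, p_n(\mathbf{x})\, d\mathbf{x}$$

Here $(k_1, \ldots, k_n)$ denotes the set of exponents belonging to the coefficient $a_{jk}$,
and $(m_1, \ldots, m_n)$ the set belonging to the m-th equation.

Or, rewriting these equations in matrix notation,

$$\mathbf{C}\, \mathbf{a} = \mathbf{d} \tag{8}$$

where

$$\mathbf{C} = \begin{pmatrix}
C_{01}^{(1)} & C_{11}^{(1)} & C_{12}^{(1)} & \cdots & C_{qA_{nq}}^{(1)} \\
C_{01}^{(2)} & C_{11}^{(2)} & C_{12}^{(2)} & \cdots & C_{qA_{nq}}^{(2)} \\
\vdots & \vdots & \vdots & & \vdots \\
C_{01}^{(N_{qn})} & C_{11}^{(N_{qn})} & C_{12}^{(N_{qn})} & \cdots & C_{qA_{nq}}^{(N_{qn})}
\end{pmatrix}, \quad
\mathbf{a} = \begin{pmatrix} a_{01} \\ a_{11} \\ \vdots \\ a_{qA_{nq}} \end{pmatrix}, \quad \text{and} \quad
\mathbf{d} = \begin{pmatrix} d_1 \\ d_2 \\ \vdots \\ d_{N_{qn}} \end{pmatrix}.$$

To solve for the $\{a_{jk}\}$ it is sufficient to invert the matrix $\mathbf{C}$, to obtain

$$\mathbf{a} = \mathbf{C}^{-1} \mathbf{d} \tag{9}$$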


Thus, if $p_n(\mathbf{x})$ is known, a q-th degree polynomial can best be fitted to $p_n(\mathbf{x})$
(in the sense of minimizing integrated squared error) by solving (9). However,
we do not know $p_n(\mathbf{x})$, and in fact are concerned with the problem of estimating
this function. This brings up the second part of the problem which is the
estimation of the vector $\mathbf{a}$.


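Each of the $\{x_v\}$ is assumed to lie in a finite restricted range,
$L_{v1} < x_v < L_{v2}$, and the elements of $\mathbf{C}$ are independent of $p_n(\mathbf{x})$ and can be
calculated:

$$C_{jk}^{(m)} = \prod_{v=1}^{n} \frac{L_{v2}^{\,m_v+k_v+1} - L_{v1}^{\,m_v+k_v+1}}{m_v+k_v+1} \tag{10}$$

The elements of $\mathbf{d}$ are moments of the amplitudes of the signals:

$$d_m = \overline{x_1^{m_1} x_2^{m_2} \cdots x_n^{m_n}} \tag{11}$$

To estimate the $\{a_{jk}\}$ it is only necessary to estimate the $\{d_m\}$,
and use

$$\hat{\mathbf{a}} = \mathbf{C}^{-1} \hat{\mathbf{d}} \tag{12}$$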


Thus, to obtain an estimate of $p_n(\mathbf{x})$ using this method requires that a device for
estimating all q-th and lower order moments of the signal amplitudes be developed,
and the inversion of an $\binom{n+q}{q} \times \binom{n+q}{q}$ matrix. For n > 1, the latter operation
will require the use of a general purpose computer. The main storage capacity
of most computers would serve to preclude this calculation for values of n and q
greater than 4.
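
A minimal sketch of the least squares polynomial fit for the first order case (n = 1) may make the procedure concrete. It assumes an amplitude range from `lo` to `hi`, builds the matrix from the closed-form integrals of powers of x over that range, and solves the linear equations by Gaussian elimination. The function names, and the use of exact moments of an assumed density $p(x) = 2x$ on [0, 1] in place of measured moments, are choices made for the example only.

```python
def solve(C, d):
    """Solve C a = d by Gaussian elimination with partial pivoting."""
    n = len(d)
    M = [row[:] + [d[i]] for i, row in enumerate(C)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    a = [0.0] * n
    for r in range(n - 1, -1, -1):                     # back substitution
        acc = M[r][n] - sum(M[r][c] * a[c] for c in range(r + 1, n))
        a[r] = acc / M[r][r]
    return a

def fit_power_series(moments, lo, hi):
    """Coefficients a_0..a_q of the least squares polynomial for n = 1.

    moments[m] is the m-th moment of the amplitude, m = 0..q.
    """
    q = len(moments) - 1
    C = [[(hi ** (m + k + 1) - lo ** (m + k + 1)) / (m + k + 1)
          for k in range(q + 1)] for m in range(q + 1)]
    return solve(C, moments)

# Assumed density p(x) = 2x on [0, 1]; its m-th moment is 2 / (m + 2).
moments = [2.0 / (m + 2) for m in range(3)]
coefficients = fit_power_series(moments, 0.0, 1.0)
print(coefficients)
```

Because the assumed density is itself a polynomial of degree one, a second degree fit should recover it, with coefficients close to (0, 2, 0). With measured moments the same steps apply, but the matrix becomes poorly conditioned as q grows, so errors in the moments are magnified.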


The block diagram of a device which would provide estimates of moments
of stationary signals is shown in Figure 1. Each moment is estimated by a time
average over a finite interval, T, of the product of the powers of the $\{s_j\}$ involved.*
Specifically,

$$\hat{d}_m = \frac{1}{T} \int_0^T s_1^{m_1}(t)\, s_2^{m_2}(t) \cdots s_n^{m_n}(t)\, dt \tag{13}$$

As indicated in Figure 1, a programmer could be incorporated in the moment
estimating device to automatically step the powers of the $\{s_j\}$ through all values
less than a specified degree, q.
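
In sampled form, the time average and the stepping of the powers can be sketched as follows. This is an illustration under assumed conditions (uniformly sampled records of equal length, and two sine waves in quadrature as test signals); the function names are hypothetical and the loop over exponent sets merely stands in for the programmer of Figure 1.

```python
import math
from itertools import product

def moment_estimate(signals, powers):
    """Average over the record of the product of s_j(t) ** m_j."""
    length = len(signals[0])
    total = 0.0
    for t in range(length):
        term = 1.0
        for s, m in zip(signals, powers):
            term *= s[t] ** m
        total += term
    return total / length

def all_moments(signals, q):
    """Step the powers through every exponent set of total degree q or lower."""
    n = len(signals)
    return {
        powers: moment_estimate(signals, powers)
        for powers in product(range(q + 1), repeat=n)
        if sum(powers) <= q
    }

# Assumed test records: one period of two unit sine waves in quadrature.
N = 1000
s1 = [math.sin(2 * math.pi * k / N) for k in range(N)]
s2 = [math.cos(2 * math.pi * k / N) for k in range(N)]

moments = all_moments([s1, s2], 4)
print(len(moments), moments[(2, 0)], moments[(1, 1)], moments[(2, 2)])
```

For these test signals the averages of $s_1^2$, $s_1 s_2$ and $s_1^2 s_2^2$ are 1/2, 0 and 1/8, which gives a check on the averaging.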

*This method of estimating average values of quantities is shown in Section
2.2.1.3 to be near-optimum.

Figure 1. Block Diagram of a Fourth Order Moment Estimation Device


In reviewing the power series approximation method of estimating
$p_n(\mathbf{x})$ (or $P_n(\mathbf{x})$), there appear to be three major objections to its use for
n > 1. The first (and perhaps most important) objection is that preliminary
or partial results cannot be obtained; i.e., no part of the probability density
or distribution functions can be examined without going through a complete
set of possibly expensive and time consuming calculations for $\hat{p}_{nq}(\mathbf{x})$.


The second objection is that for values of n and q greater than one, a
computer must be utilized to invert a matrix to obtain the desired coefficients.
This feature degrades the potential utility of a probability density estimation
device for performing a quick analysis of signals, unless a general purpose
computer is immediately available to the user.


The other major objection to this technique is that the construction of
an accurate power device and multiplier with wide dynamic range is a costly
undertaking. In addition to the method outlined above, there are a variety of
techniques for designing these devices, but as the order n becomes larger, the
accuracy requirements become so stringent that the equipment feasibility of
calculating higher order moments becomes doubtful.


There are several other factors which bear upon the accuracy attainable
with this approach to the problem (for instance, the effect of the limited processing
time, T), some of which are common to all methods of estimation. An examination
of moment estimation accuracy is reported in subsection 2.2.1.3.


If another method of measuring closeness of approximation is used in
the power series method, then similar, but possibly less convenient results
are obtained. For instance, by minimizing the integrated relative squared
error, $\epsilon_{qr}^2$, instead of $\epsilon_q^2$, the solution for the $\{a_{jk}\}$ is again given by (9),
but with

$$C_{jk}^{(m)} = \int_X \frac{x_1^{m_1+k_1} x_2^{m_2+k_2} \cdots x_n^{m_n+k_n}}{\left[ p_n(\mathbf{x}) \right]^2}\, d\mathbf{x}$$

and

$$d_m = \int_X \frac{x_1^{m_1} x_2^{m_2} \cdots x_n^{m_n}}{p_n(\mathbf{x})}\, d\mathbf{x} . \tag{14}$$

Apparently, there is no direct way in which either the $\{C_{jk}^{(m)}\}$ or the $\{d_m\}$ can
be estimated when $p_n(\mathbf{x})$ is unknown. Therefore, in addition to the lengthy matrix
calculations and cumbersome operations associated with an attempt to approximate
$p_n(\mathbf{x})$ with a power series, one must be content to attempt to minimize a particular
measure of closeness of approximation to $p_n(\mathbf{x})$.


2.2.1.2 General Series Approximation

A natural generalization of the power series approximation is the
representation of $p_n(\mathbf{x})$ with a series of the form

$$p_n(\mathbf{x}) \approx p_{nM}(\mathbf{x}) = \sum_{m=1}^{M} a_m f_m(\mathbf{x}) \tag{15}$$

where the $\{f_m(\mathbf{x})\}$ are known functions defined over the region X. Following
the same procedure as with the power series approximation, we may choose
the coefficients $\{a_m\}$ so that the integrated squared error

$$\epsilon_M^2 = \int_X \left[ p_n(\mathbf{x}) - p_{nM}(\mathbf{x}) \right]^2 d\mathbf{x} \tag{16}$$

is minimized. These coefficients are determined by

$$\mathbf{a} = \mathbf{C}^{-1} \mathbf{d} \tag{17}$$

where

$$C_{jk} = \int_X f_j(\mathbf{x})\, f_k(\mathbf{x})\, d\mathbf{x}$$

and

$$d_j = \int_X f_j(\mathbf{x})\, p_n(\mathbf{x})\, d\mathbf{x}$$

for $j, k = 1, 2, \ldots, M$.


By  choosing  the  {f  (xj}  to  be  orthonormal  over  X,  this  solution  is  simplified  to 


a  *  f  (x)  ,  (iS) 

m  m  ^ 


and  the  problem  of  estimating  p^(xj  w^th  Pnj^(2L)  evolves  into  that  of  estimating 

with  a  •  Then 
m 

M 


PnM 


(x) 
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L, 


/v 
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f  (x) 


m  ~ 


(19) 
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While  the  matrix  inversion  problem  involved  in  the  power  series  approximation 
has  been  obviated,  this  approach  still  requires  the  construction  of  nonlinear 

devices  for  determining  p  wlx). 

nM  ~ 
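As a purely illustrative sketch of equations (18) and (19) for a first order density, the fragment below estimates each coefficient as a sample average of the corresponding basis function and then sums the series. The cosine basis on [0, 1], the function names, the zero-based term index, and the use of independent samples in place of a time average are all assumptions made for this example; the report does not specify a particular orthonormal set.

```python
import math
import random

def cosine_basis(m, x):
    """m-th member of an orthonormal cosine set on [0, 1] (assumed basis, m = 0, 1, ...)."""
    if m == 0:
        return 1.0
    return math.sqrt(2.0) * math.cos(m * math.pi * x)

def estimate_coefficients(samples, num_terms):
    """Eq. (18) with the average of f_m replaced by an average over the samples."""
    n = len(samples)
    return [sum(cosine_basis(m, s) for s in samples) / n
            for m in range(num_terms)]

def series_density(coeffs, x):
    """Eq. (19): weighted sum of the basis functions evaluated at x."""
    return sum(a * cosine_basis(m, x) for m, a in enumerate(coeffs))

rng = random.Random(0)
# Hypothetical test signal: triangular density on [0, 1] with its peak at 0.5.
data = [rng.triangular(0.0, 1.0, 0.5) for _ in range(20000)]
a_hat = estimate_coefficients(data, num_terms=6)
for x in (0.1, 0.5, 0.9):
    print(x, round(series_density(a_hat, x), 3))
```

No matrix inversion appears anywhere in the sketch, which is the simplification the orthonormal choice is meant to buy; the price is that every basis function must be evaluated accurately for every sample.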


One way to implement this technique is to choose the $\{f_m(\mathbf{x})\}$ to be separable functions of $\mathbf{x}$; i.e.,

$$f_m(\mathbf{x}) = \prod_{k=1}^{n} g_{m_k}(x_k) \tag{20}$$

where a unique index $m$ is assigned to each n-tuple $(m_1, \ldots, m_n)$. If the first order density function of each variate is to be approximated by $q$ terms of a series expansion in the g-functions, then an estimate of the n-th order process can include up to $M = q^n$ terms. Estimates of the coefficients are now provided by

$$\hat a_m = \frac{1}{T} \int_T \prod_{k=1}^{n} g_{m_k}[s_k(t)]\, dt. \tag{21}$$

The estimates $\{\hat a_m\}$ can be obtained by constructing a device as indicated in Figure 2. As indicated, if all $q^n$ terms are to be included in the approximation, then $nq$ different g-function devices are required, since the series includes terms of the form $\prod_{k=1}^{n} g_{m_k}(x_k)$. It is conceivable that fewer than $q^n$ terms could be used for the approximation to reduce equipment construction requirements. However, this would definitely degrade severely the accuracy with which some probability density functions could be estimated.


The major objections to the use of this method of estimation are (a) the requirement to perform all of the calculations involved in an estimate of the entire probability density function before any results can be obtained, (b) the high accuracy requirements in constructing the g-function devices, and (c) the large number of such devices required for accurately estimating higher order functions. The absence of a matrix inversion calculation makes this method much more attractive than the power series approximation.


We  have  considered  the  above  objections  to  the  orthonormal  function 

series  approximation  method  to  be  sufficient  to  remove  it  from  consideration  for 
implementation  on  this  project.  However,  for  display  purposes  this  method  does 
possess  a  novel  feature  which  merits  description. 


Consider for the moment the problem of displaying the first order probability density function, $p_1(x)$. If $q = M$ linear filters are constructed so that the k-th filter has an impulse response $g_k(t)$, then the estimate $\hat p_1(t)$ can be obtained as a function of time by summing the weighted outputs of these filters when their common input is an impulse. As indicated in Figure 3, the weighting coefficients are the $\{\hat a_k\}$. Thus, $\hat p_1(t)$ is computed by

$$\hat p_1(t) = \sum_{k=1}^{q} \hat a_k \int \delta(t-\tau)\, g_k(\tau)\, d\tau = \sum_{k=1}^{q} \langle g_k[s(t)] \rangle_T\, g_k(t) \tag{22}$$

where the bracket symbols $\langle\;\rangle_T$ denote a finite time average of duration $T$.

This function of time can be displayed on an oscilloscope or in any other conveniently readable way.


Although this display feature has attracted experimenters to the orthonormal function series approximation technique for estimating first order probability density functions*, its practical utility for use with higher order statistics would be highly limited. First, only two-dimensional cross sections of $\hat p_{nM}(\mathbf{x})$ could be displayed, and these would have the abscissa defined by $x_j = x_1 + \delta_j$, $j = 1, \ldots, n$, where $\delta_j$ is a constant of a given display. A more important difficulty associated with this display method for use with higher order statistics is the complexity of equipment required. The form of the display device for n-th order functions is shown in Figure 4. Equipment requirements include $q$ linear filters, $nq$ adjustable delays, and $q^n$ product devices. The output of each linear filter is delayed $n$ (possibly all different) times and these $n$ waveforms are routed to $q^{n-1}$ different product devices. In the special case $n = 1$, no product devices are required; however, for higher order statistics the utility of this display technique is tremendously outweighed by the equipment complexity involved.

*See, for instance, [10].


Figure  3.  A  Device  for  Displaying  p 


2.2.1.3 Methods of Estimating Averages of Functions of Signals

To clarify and justify the method we have proposed in the two preceding subsections for estimating averages of functions of stationary processes, we now consider a rather general question regarding the estimation of $\overline{f(x)}$. Specifically, suppose we observe a single member of a stationary random process $\{s(t)\}$ for a limited observation time $T$. What is the best method of processing this data to obtain an estimate of $\overline{f(x)}$? Clearly the answer to this question will depend on the criterion of "best" and on the class of allowable operators on the process $\{s(t)\}$. We will consider the class of linear operators, and utilize the mean square error as an indication of estimation error.


As an example, suppose we are trying to estimate the mean of the process $\{s(t)\}$, according to the linear operator

$$\hat m = \int_T h(t)\, s(t)\, dt. \tag{23}$$

(Notice that as a special case, if $h(t)$ were a comb of impulses, $\hat m$ would be formed as a sum of samples; thus, sampling is a subclass of the operators we are considering.) In order that $\hat m$ be an unbiased estimate, we should have (since $\{s(t)\}$ is a stationary process)

$$\overline{\hat m} = \int_T h(t)\, \overline{s(t)}\, dt = \int_T h(t)\, m\, dt = m \tag{24}$$

giving

$$\int_T h(t)\, dt = 1. \tag{25}$$

The problem then becomes that of minimizing the variance of $\hat m$ by choice of $h(t)$ subject to the constraint above. A calculus of variations technique yields an integral equation for the optimum $h(t)$:

$$\int_T h(\tau)\, R(t-\tau)\, d\tau = C, \quad t \in T \tag{26}$$

where $R(\tau)$ is the autocorrelation function of $s(t)$ and $C$ is a constant. Now $R(\tau)$ decays to a steady value for values of $\tau$ approximately $1/W$ or greater, where $W$ is the bandwidth of the process $\{s(t)\}$. Then if $T \gg 1/W$, an approximate but very good solution, except for negligible end effects, is

$$h(\tau) = \frac{1}{T}, \quad \tau \in T. \tag{27}$$


This is not the exact solution; however, for $TW \gg 1$, it performs just about as well.* This solution indicates that one should not sample the waveform $s(t)$ at all, but use all of it according to

$$\hat m = \frac{1}{T} \int_T s(t)\, dt. \tag{28}$$

It may then be shown that the estimation error is approximated by

$$\sigma^2(\hat m) \cong \frac{1}{T} \int [R(\tau) - m^2]\, d\tau. \tag{29}$$

The same result is obtained for the problem of estimating the average value of any function of $s(t)$, if $R(\tau)$ and $W$ are interpreted as the correlation function and bandwidth, respectively, of the function of $s(t)$. Thus the optimum linear method of estimating $\overline{f[s(t)]}$ is to integrate the function over the available time $T$, and divide by $T$.


To obtain some idea of the processing time, $T$, required to obtain an accurate estimation with this method, we now consider an alternate method which provides similar (but less) accuracy. Specifically, suppose that the n-th moment of a stationary process, $\mu_n$, is estimated by sampling the n-th power of the process periodically and averaging the sample values. This is a linear estimation procedure with $h(t) = \frac{1}{N}\sum_{k=1}^{N} \delta(t - k\tau)$, i.e.,

$$\hat\mu_n = \frac{1}{N} \sum_{k=1}^{N} s_k^n \tag{30}$$

where $\{s_k\}$ is the sequence of $N$ samples. This estimate is unbiased, and if the $N$ samples are all statistically independent, the variance of the estimate is given by

$$\sigma^2(\hat\mu_n) = \frac{1}{N}\left(\mu_{2n} - \mu_n^2\right). \tag{31}$$

*See [1] and [4].


Now we do not know the $\{\mu_n\}$; in fact, we are trying to approximate them. However, we can get a rough idea of how the variances $\sigma^2(\hat\mu_n)$ depend on $n$ and $N$ by assuming specific forms for $p_1(x)$ and calculating the dependence. In this manner, by combining the results, a rough quantitative estimate of the number of samples $N$ necessary to evaluate the various moments will be obtained. We shall here assume only one particular form, Gaussian.

For a Gaussian distribution (zero mean)

$$\mu_n = \begin{cases} 0, & n \text{ odd} \\ \sigma^n (n-1)(n-3)\cdots(1), & n \text{ even} \end{cases} \tag{32}$$

where $\sigma^2$ is the variance of the process. The variance of the estimate is then obtained by substituting in the formula above. The relative accuracy is, for $n$ even,

$$\frac{\sigma(\hat\mu_n)}{\mu_n} = \left\{ \frac{1}{N} \left[ \frac{(2n-1)(2n-3)\cdots(1)}{[(n-1)(n-3)\cdots(1)]^2} - 1 \right] \right\}^{1/2}. \tag{33}$$

Thus the number of terms, $N$, necessary to obtain accurate estimates of $\mu_n$ increases as $n$ increases. We can, from the above formula, however, determine just how many samples are necessary to include higher order terms. For example, to include the sixth order term with a relative accuracy of 5 percent, 18,000 terms are necessary; of course, this number of terms gives better accuracy for lower order terms. (For $n$ odd, values of $N$ intermediate between values for neighboring even $n$ values will be sufficient.)

For other forms of density functions, the number of terms, $N$, necessary for a prescribed accuracy will differ from those above, although not widely. For example, assumption of an exponential p.d.f. leads to values of $N$ different by roughly a factor of two for low $n$.
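The sample counts quoted for the Gaussian case can be checked with a few lines of arithmetic. The sketch below substitutes the Gaussian moments of (32) into the variance formula (31) and solves for the number of independent samples giving a prescribed relative error; the function names are our own and the calculation assumes statistically independent samples, as in the text.

```python
def odd_product(k):
    """Return k (k-2) (k-4) ... (1) for odd k >= 1, and 1 for k <= 0."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def gaussian_moment(n, sigma=1.0):
    """Eq. (32): n-th moment of a zero-mean Gaussian process."""
    if n % 2 == 1:
        return 0.0
    return sigma ** n * odd_product(n - 1)

def samples_needed(n, relative_error):
    """N for which sigma(mu_hat_n) / mu_n equals relative_error (n even)."""
    if n % 2 == 1:
        raise ValueError("the relative accuracy is defined here for even n only")
    ratio = gaussian_moment(2 * n) / gaussian_moment(n) ** 2
    return (ratio - 1.0) / relative_error ** 2

for order in (2, 4, 6):
    print(order, round(samples_needed(order, 0.05)))
```

For the sixth order moment at 5 percent relative error this gives about 18,080 samples, in agreement with the figure of 18,000 terms quoted above.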


An alternate method of establishing when a particular sample size $N$ is adequate, which makes no presumptions about the form of the p.d.f. being estimated, is now discussed. This method has not been studied extensively, but it is a powerful and important technique which merits consideration. Suppose we are attempting to approximate $\mu_2$. The variance of the estimate is then

$$\sigma^2(\hat\mu_2) = \frac{\mu_4 - \mu_2^2}{N}. \tag{34}$$


Now we may put a bound on this quantity as follows: for a p.d.f. with a limited dynamic range and prescribed second moment, we must have

$$\mu_4 \le \mu_2 B^2 \tag{35}$$

where $B$ is the maximum value of $|x|$. This may be seen by noticing that

$$\mu_4 = \int_{-B}^{B} x^4 p_1(x)\, dx \le \int_{-B}^{B} B^2 x^2 p_1(x)\, dx = B^2 \mu_2 \tag{36}$$

and in fact may be realized by

$$p_1(x) = \left(1 - \frac{\mu_2}{B^2}\right)\delta(x) + \frac{\mu_2}{B^2}\,\delta(x - B). \tag{37}$$

Therefore

$$\sigma^2(\hat\mu_2) \le \frac{1}{N}\,\mu_2\,(B^2 - \mu_2), \tag{38}$$


and a bound on the variance of the estimate is obtained as a function of the statistic itself. Therefore, as $N$ is increased, and $\hat\mu_2$ begins to stabilize, this value may be substituted in the right-hand side of the above equation to also place a bound on the variance. Thus, estimation of $\mu_2$ carries along with it an estimate of the variance of estimation! Notice that only a very crude pre-estimate of $\mu_2$ need be evaluated this way. For example, we might take $N_1 = 100$, obtaining $\hat\mu_2$. For a prescribed $\sigma^2(\hat\mu_2)$, the above equation might then indicate $N_2 = 500$. As a check on this sample size, we can lastly compute, for $N_2 = 500$, $\sigma^2(\hat\mu_2)$ and see if this is satisfactory.

Notice that this method requires no presumption about the p.d.f. form, except for $B$, the dynamic range, which can be quickly and easily evaluated.
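The two-stage procedure just described can be sketched as follows. The bound used is the right-hand side of (38); the function names, the uniform test signal, the dynamic range of 1, and the target variance are hypothetical choices made only to illustrate the steps.

```python
import math
import random

def second_moment(samples):
    """Sample estimate of mu_2 (eq. (30) with n = 2)."""
    return sum(s * s for s in samples) / len(samples)

def variance_bound(mu2_hat, dynamic_range, num_samples):
    """Right-hand side of eq. (38): mu_2 (B^2 - mu_2) / N."""
    return mu2_hat * (dynamic_range ** 2 - mu2_hat) / num_samples

def samples_for_variance(mu2_hat, dynamic_range, target_variance):
    """Smallest N for which the bound of eq. (38) does not exceed the target."""
    return math.ceil(mu2_hat * (dynamic_range ** 2 - mu2_hat) / target_variance)

rng = random.Random(0)
B = 1.0                                            # assumed dynamic range
pilot = [rng.uniform(-B, B) for _ in range(100)]   # crude pre-estimate, N1 = 100
mu2_pilot = second_moment(pilot)
n2 = samples_for_variance(mu2_pilot, B, target_variance=1e-4)
print(n2, variance_bound(mu2_pilot, B, n2))
```

Because the bound holds for any density confined to $|x| \le B$, the sample size it suggests is conservative; the check after the second stage simply re-evaluates the same expression with the refined estimate.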


This  method  of  bounding  the  variance  of  an  estimate,  by  using 
approximate  values  of  the  statistic  itself,  can  be  extended  to  higher  order 
moments.  It  is  expected  that  the  bounds  obtained  become  relatively  weaker 
as  the  order  increases. 


As an indication of the number of statistically independent samples which may be obtained from a process in time $T$, we may take the Nyquist rate: $\tau_c = 1/W$. Thus, the relative error in estimation of the n-th moment of a process $\{s(t)\}$ with bandwidth $W$ is approximately


<t(S  ) 

n 


(39) 


The inclusion of higher moments in an approximation to $p_1(x)$ is thus seen to impose a longer processing time for each moment in the estimation procedure. For instance, (39) indicates that to estimate second and fourth order moments of a 10 kc bandwidth process with a one percent relative error requires $T = 0.7$ second and $T = 5.7$ seconds, respectively. For a 100 cps process, these times are approximately 1 and 9 minutes, respectively.


The difference between the optimum and sampling estimation methods is probably not very great. For instance, for $R(\tau)$ parabolic between the origin and $\tau_c = 1/W$, it can readily be shown that

$$\frac{\sigma^2(\hat\mu_1)\ \text{(sampling method)}}{\sigma^2(\hat\mu_1)\ \text{(optimum method)}} = \frac{3}{2}. \tag{40}$$
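A brief worked check of the 3/2 figure, under assumptions that are not spelled out in the text: "parabolic" is read here as $R(\tau) - m^2 = \sigma^2 (1 - |\tau|/\tau_c)^2$ for $|\tau| \le \tau_c$ and zero beyond, the integral in (29) is taken over all $\tau$, and samples spaced $\tau_c$ apart are treated as uncorrelated so that $N = T/\tau_c$. Here $\sigma^2$ denotes the variance of the process.

```latex
\sigma^2_{\mathrm{opt}} \approx \frac{1}{T}\int_{-\tau_c}^{\tau_c}
    \sigma^2\left(1 - \frac{|\tau|}{\tau_c}\right)^{2} d\tau
  = \frac{2\,\sigma^2\,\tau_c}{3\,T},
\qquad
\sigma^2_{\mathrm{samp}} = \frac{\sigma^2}{N} = \frac{\sigma^2\,\tau_c}{T},
\qquad
\frac{\sigma^2_{\mathrm{samp}}}{\sigma^2_{\mathrm{opt}}} = \frac{3}{2}.
```

Under this reading the continuous average gains only a factor of 3/2 in variance over sampling at intervals of $\tau_c$.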


For higher order moments the difference may be larger, but it is expected that (40) provides a reasonable estimate of the relative error associated with an estimate of the n-th order moment of a process.

2.2.2 Success Counting Methods


Another approach to the problem of estimating either $p_n(\mathbf{x})$ or $P_n(\mathbf{x})$ consists of simply counting the number of "successes" in a number of trials of an experiment. For any process(es) it would be possible to obtain an n-tuple sample, $\mathbf{s} = (s_1, s_2, \ldots, s_n)$, by simply initiating the process and recording the sample values taken at the appropriate times ($s_1$ at time $t_1$, etc.). In general, $N$ statistically independent n-tuple samples of the process, $\mathbf{s}_k = (s_{k1}, s_{k2}, \ldots, s_{kn})$, $k = 1, 2, \ldots, N$, could be obtained by reinitiating the process after each sample. In this manner, it would be possible to perform an estimation of $p_n(\mathbf{x})$ based on an examination of $N$ members of the ensemble of waveforms constituting the process. Specifically, let

$$f(\mathbf{s}_k) = \begin{cases} 1 & \text{if } \mathbf{s}_k \text{ falls in } R_{\mathbf{x},\Delta} \\ 0 & \text{if } \mathbf{s}_k \text{ falls outside } R_{\mathbf{x},\Delta} \end{cases} \tag{41}$$

where the region $R_{\mathbf{x},\Delta}$ contains all values of $\mathbf{x}'$ such that

$$\max_{1 \le \ell \le n} |x'_\ell - x_\ell| \le \frac{\Delta}{2} \tag{42}$$

and $\Delta$ is a (small) positive number. An estimate of $p_n(\mathbf{x})$ is provided by

$$\hat p_n(\mathbf{x}) = \frac{1}{\Delta^n N} \sum_{k=1}^{N} f(\mathbf{s}_k). \tag{43}$$

This estimate is the number of samples which fall in $R_{\mathbf{x},\Delta}$, divided by the total number of samples (properly normalized for the given $\Delta$). One justification for this method is provided by the law of large numbers*, which states that

$$\Pr\left\{ \left| \frac{1}{N}\sum_{k=1}^{N} f(\mathbf{s}_k) - \Pr\{\mathbf{s} \in R_{\mathbf{x},\Delta}\} \right| > \epsilon \right\} \to 0 \tag{44}$$

as $N \to \infty$, where $\epsilon$ is any positive number.

*[2], Chapter X.
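A minimal sketch of the success counting estimate (43) follows. It assumes the cell $R_{\mathbf{x},\Delta}$ is the cube of side $\Delta$ centered on $\mathbf{x}$, and it uses independent uniformly distributed pairs as hypothetical test data; the function names are ours.

```python
import random

def in_cell(sample, center, delta):
    """Success function f(s): 1 if s lies in the cube of side delta centered at center."""
    return int(max(abs(s - c) for s, c in zip(sample, center)) <= delta / 2.0)

def density_estimate(samples, center, delta):
    """Eq. (43): number of successes divided by N and by the cell volume delta**n."""
    n = len(center)
    successes = sum(in_cell(s, center, delta) for s in samples)
    return successes / (len(samples) * delta ** n)

rng = random.Random(0)
# Hypothetical data: N independent pairs, uniform on the unit square (true density 1).
data = [(rng.random(), rng.random()) for _ in range(20000)]
print(density_estimate(data, center=(0.5, 0.5), delta=0.2))
```

Only about 4 percent of the samples land in the chosen cell, which illustrates why the number of trials must grow as the cells shrink or the order $n$ increases.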

Now this method of estimating the value of $p_n(\mathbf{x})$ at any given point $\mathbf{x}$ (as contrasted with earlier methods of fitting a curve to $p_n(\mathbf{x})$ over all $\mathbf{x}$) can be applied to all processes for which*

$$\frac{1}{N^2} \sum_{k=1}^{N} \sigma^2[f(\mathbf{s}_k)] \to 0 \tag{45}$$

as $N \to \infty$.


A possible drawback to this brute force approach is that considerable time may be required to generate each sample, and therefore an exorbitant delay may be encountered in completing the estimation of $p_n(\mathbf{x})$ at many points $\mathbf{x}$. If the process is nonstationary, then there may be no substitute for this direct approach. However, if the process is stationary, then instead of dealing with samples from many different members of the ensemble of time functions, an estimate could be based on samples taken from a single member of the ensemble.

The question naturally arises as to whether other procedures (besides sampling) would be better for processing $T$ seconds of a set of signals $\mathbf{s}(t)$ to obtain an estimate of $p_n(\mathbf{x})$. Following the success counting approach we now consider this question, restricting the processing methods to linear operations. Specifically, to estimate $p_n(\mathbf{x})$ as a constant** in the region $R_{\mathbf{x},\Delta}$, we define

$$p^*_n(\mathbf{x}) \equiv \frac{1}{\Delta^n} \int_{R_{\mathbf{x},\Delta}} p_n(\mathbf{x}')\, d\mathbf{x}'. \tag{46}$$

*Ibid., p. 238.

**This estimation will produce what is called a histogram approximation to $p_n(\mathbf{x})$.


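A "success" counting function can again be defined such that

$$f[\mathbf{s}(t)] = \begin{cases} 1, & \text{if } \mathbf{s}(t) \text{ falls in } R_{\mathbf{x},\Delta} \\ 0, & \text{if } \mathbf{s}(t) \text{ falls outside } R_{\mathbf{x},\Delta} \end{cases} \tag{47}$$

where the n-tuple $\mathbf{s}$ is actually a function of time. The function $f(\mathbf{s})$ registers whenever $\mathbf{s}(t)$ falls in the region $R_{\mathbf{x},\Delta}$. Again let $\Delta$ be a (small) positive number. Then, the most general linear operation on the limited data provided by $f[\mathbf{s}(t)]$ in an interval $T$ is

$$\hat p_n(\mathbf{x}) = \int_T h(t)\, f[\mathbf{s}(t)]\, dt. \tag{48}$$

(Notice again that if $h(t)$ were a comb of impulses, $\hat p_n(\mathbf{x})$ would be a sum of samples -- the usual "counting" method.) An unbiased estimate has

$$\overline{\hat p_n(\mathbf{x})} = p^*_n(\mathbf{x}) = \int_T h(t)\, \overline{f[\mathbf{s}(t)]}\, dt = \overline{f(\mathbf{s})} \int_T h(t)\, dt = \int_{R_{\mathbf{x},\Delta}} 1 \cdot p_n(\mathbf{x}')\, d\mathbf{x}' \int_T h(t)\, dt = \Pr(\mathbf{s} \in R_{\mathbf{x},\Delta}) \int_T h(t)\, dt. \tag{49}$$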


Using eq. (46), this provides the constraint:

$$\int_T h(t)\, dt = 1/\Delta^n. \tag{50}$$

If we minimize $\sigma^2[\hat p_n(\mathbf{x})]$ by choice of $h(t)$, we obtain the following integral equation for the optimum $h(t)$:

$$\int_T h(\tau)\, R_f(t-\tau)\, d\tau = C, \quad t \in T \tag{51}$$

where

$$R_f(t-\tau) = \overline{f[\mathbf{s}(t)]\, f[\mathbf{s}(\tau)]}. \tag{52}$$

Again, the correlation of $\mathbf{s}(t)$ extends only over roughly an interval $1/(2W)$, where $W$ is the bandwidth of $\mathbf{s}(t)$. If $T \gg 1/W$, an approximate solution for $h(t)$ is

$$h(t) = \frac{1}{T\Delta^n}, \quad t \in T. \tag{53}$$


Thus when $TW \gg 1$ (many Nyquist intervals in the observation interval), optimum estimation is obtained by the equation

$$\hat p_n(\mathbf{x}) = \frac{1}{\Delta^n T} \int_T f[\mathbf{s}(t)]\, dt. \tag{54}$$

This estimate is the percentage of time that $\mathbf{s}(t) \in R_{\mathbf{x},\Delta}$ while $t \in T$, weighted according to the value of $\Delta$ chosen.

In summary, if $TW \gg 1$, the best way to use a given amount of continuous data (i.e., a segment of $\mathbf{s}(t)$) for estimating $p^*_n(\mathbf{x})$ is to weight all of it equally.


To  implement  this  method  would  require  the  construction  of  a  device 
consisting  of  an  integrator  followed  by  a  multiplier.  The  integrator  would  be 
required  to  have  a  wide  dynamic  range  in  both  output  voltage  and  integration  time. 
The  output  voltage  dynamic  range  is  a  function  of  the  dynamic  range  of  the  proba¬ 
bilities  to  be  measured.  A  40  db  range  would  be  a  minimum  requirement.  The 
integration  time  is  a  function  of  (a)  the  bandwidth  of  the  signals  being  measured, 
(b)  desired  accuracy  requirements,  and  (c)  the  probability  being  measured. 


The same result was obtained for other statistics (averages of functions of $x$) of stationary signals in Section 2.2.1.3 above.

As the number of cells (used to cover the signal space) increases, the probability that a set of signals will fall in a given cell decreases; e.g., for a 4-th order uniform probability density function with 3 bit quantization in each dimension, the probability that the input signals will fall into a given cell is 1/4096. If such cells as well as those with probability near unity are of interest, and if, in addition, a wide range of signal bandwidths are to be
considered,  then  the  dynamic  range  of  the  integration  time  would  be  greater 
than  10^.  The  construction  of  an  integrator  within  tolerances  imposed  by 
these  requirements  would  be  a  formidable  task. 


Fortunately, the difficulties associated with the optimum success counting method of estimation can be overcome through the use of a nonoptimum, but still efficient, technique involving sampling the signals, $\mathbf{s}(t)$. Specifically, the optimum method of estimating the probability that $\mathbf{s}(t)$ falls in any specified region*, $R$, may be approximated by a generalization of eq. (43):

$$\widehat{\Pr}\{\mathbf{s} \in R\} = \frac{1}{N} \sum_{k=1}^{N} f(\mathbf{s}_k) \tag{55}$$

where $f(\mathbf{s}) = 1$ if $\mathbf{s} \in R$, and $f(\mathbf{s}) = 0$ if $\mathbf{s} \notin R$.

That this estimate is nearly equivalent to the optimum one is indicated by equation (40). The utility of this method follows from the use of digital circuitry. The most significant advantage derived from the use of digital circuitry is in the flexibility of the digital integrator (counter): the dynamic range and resolution can be doubled by the addition of a single stage; the integration time can be varied either by the addition of counter stages or by changes in clock frequency; and there are no drift problems. Some fringe benefits derived from the use of digital circuitry are: (a) the success regions can be conveniently stepped automatically from one cell to another; (b) the results can be easily presented in a form convenient for entry to a computer; (c) the output data can be easily normalized to provide equal resolution of both low and high probabilities.


2.2.2.1 Parallel Processing With a Digital Computer

Having decided that periodic samples of a set of signals $\mathbf{s}(t)$ will form the basis of an estimation, there are two alternative ways of processing these samples. The first, called parallel processing (Figure 5), consists of sampling $\mathbf{s}(t)$, analog to digital conversion of the sample values, changing the digital format, and inserting the resulting data into a general purpose digital computer. The computer would then be programmed to count all the successes occurring in specified regions to produce an estimate of $p_n(\mathbf{x})$, $P_n(\mathbf{x})$ or, in general, $\Pr\{\mathbf{s} \in R\}$.
n  ^  n  ~ 


To estimate the practical limitations imposed on this technique by the characteristics of a general purpose computer, suppose the available main storage capacity is equal to 4096 computer words.* With this capacity, the data will have to be read into the computer in segments. Assuming that half of the storage capacity is used for recording a histogram estimate of $p_n(\mathbf{x})$, and the other half is used to store a segment of data, the signal space $X$ may be partitioned into at most $2048\,w/v$ cells, where each computer word is composed of $w$ bits, and $p_n(\mathbf{x})$ is recorded with $v$-bit accuracy per cell. With the CDC 160 computer, $w = 12$ bits per word, and the total number of cells allowable by a 6-bit quantization of $p_n(\mathbf{x})$ in each cell is approximately $2^{12}$ cells. To estimate an n-th order probability density function with a histogram based on a cell structure with $q$-bit accuracy in each variate, a total of $2^{nq}$ cells are required. Thus, we obtain the restriction:

$$nq \le 12. \tag{56}$$

A value of $q = 3$ would seem to be a minimum resolution capability; this quantization would allow for the computation of up to fourth order probability density functions.
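The storage arithmetic behind restriction (56) can be written out in a few lines. The sketch assumes, as in the text, that half of main storage holds the histogram; the function and parameter names are ours.

```python
def max_cells(storage_words, word_bits, cell_bits, histogram_fraction=0.5):
    """Number of v-bit cells that fit in the share of storage given to the histogram."""
    return int(storage_words * histogram_fraction * word_bits) // cell_bits

def max_order(storage_words, word_bits, cell_bits, q):
    """Largest order n for which 2**(n*q) cells fit, i.e. n*q <= log2(max_cells)."""
    cells = max_cells(storage_words, word_bits, cell_bits)
    n = 0
    while 2 ** ((n + 1) * q) <= cells:
        n += 1
    return n

# CDC 160 figures from the text: 4096 words, w = 12 bits, v = 6 bits per cell.
print(max_cells(4096, 12, 6))        # cells available for the histogram
print(max_order(4096, 12, 6, q=3))   # highest order with 3-bit quantization
```

With these figures 4096 cells are available, so 3-bit quantization permits at most a fourth order density, as stated above.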


The time required to collect and process data with this technique is, of course, largely determined by the computer utilized. However, the cost of operating the computer may be approximately the same for different computers. As an illustration, consider the Recomp II and the problem of estimating a fourth order probability density function with 3-bit quantization per variate, and 5-bit accuracy of representing $p_n(\mathbf{x})$ in each cell. By assigning two of the $2^{12} = 4096$ cells to a single computer word, approximately half of the main storage (2048 words) is made available for raw digital data input. Since each sample consists of 12 bits, approximately 6000 samples of data can be handled in a single, continuous segment. The input data rate is limited for this computer to 60 words per second, or 180 samples per second; thus each segment of data can be read into the computer in approximately thirty seconds. If $2^{19}$ samples are processed, then approximately 80 segments of data would have to be processed. The total read-in time would therefore be less than one hour. The processing time to calculate probabilities would probably be at most an order of magnitude longer.

*As is the case with the Packard Bell 440, Recomp II, CDC 160, and other computers.


Implementation of the parallel processing technique requires only the construction of sampling, analog-to-digital conversion, and digital format conversion equipment. However, a possibly major drawback to this method of probability estimation is its inherent reliance on a general purpose computer. If a computer is not available, then a probability estimation cannot be accomplished. Another unattractive feature of this method is the requirement to obtain and store a large quantity of data in the process of obtaining a probability estimate. Unless the signal samples can be fed directly into the computer, some form of intermediate storage is required. Assuming again the necessity for $2^{19}$ samples*, and using 3-bit quantization of each of four signal amplitudes, more than 6 million bits would have to be stored for each probability calculation. If paper tape is used as the intermediate storage medium with 6-bit characters, then more than eight thousand feet of tape (or sixteen 500 foot rolls) would be required for each probability calculation. In addition to the possibility of introducing errors, the increased processing time for punching and reading the data would be significant -- as much as 5 hours for punching.
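The read-in and tape figures above can be reproduced with simple arithmetic. Several numbers in this sketch are assumptions rather than values stated in the text: the sample count of $2^{19}$ is inferred from the "more than 6 million bits" figure, three samples per word is inferred from 60 words per second giving 180 samples per second, and the tape density of 10 characters per inch is a typical value for punched paper tape that the report does not give.

```python
SAMPLES = 2 ** 19            # assumed total number of 4-tuple samples
BITS_PER_SAMPLE = 12         # 3 bits for each of four signal amplitudes
WORDS_FOR_DATA = 2048        # half of main storage
SAMPLES_PER_WORD = 3         # assumed from 60 words/s = 180 samples/s
SAMPLES_PER_SECOND = 180
BITS_PER_CHAR = 6            # paper tape character size
CHARS_PER_INCH = 10          # assumed tape density

samples_per_segment = WORDS_FOR_DATA * SAMPLES_PER_WORD
segments = -(-SAMPLES // samples_per_segment)          # ceiling division
seconds_per_segment = samples_per_segment / SAMPLES_PER_SECOND
read_in_minutes = SAMPLES / SAMPLES_PER_SECOND / 60
total_bits = SAMPLES * BITS_PER_SAMPLE
tape_feet = total_bits / BITS_PER_CHAR / CHARS_PER_INCH / 12

print(samples_per_segment, segments, round(seconds_per_segment, 1))
print(round(read_in_minutes, 1), total_bits, round(tape_feet))
```

Under these assumptions the segment size, read-in time, bit count, and tape length all fall in line with the approximate figures quoted in the text.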


Because  of  the  desire  (for  this  study)  to  make  the  probability  analyzer 
independent  of  other  computational  aids,  and  the  intermediate  storage  problem, 
we  have  chosen  the  other  method  of  implementing  the  success  counting  proba¬ 
bility  estimating  technique,  namely  serial  processing  with  a  completely  self- 
contained  device. 


*A number estimated (in Section 2.3.2) to be sufficient for a cell structure with 4096 cells.

2.2.2.2 Serial Processing with a Self-Contained Device

Construction of a self-contained device for simultaneously calculating the number of samples which fall into each of a large number of cells in signal space is precluded by its cost, if the number of cells, $c$, is very large (as it will be for $n > 1$). We must therefore be content with a device which utilizes a period of time, $T$, for estimating the probability that $\mathbf{s}$ falls in a single cell, and repeats the process for every other cell examined. The block diagram of such a serial estimation device is shown in Figure 6. The first operation performed on $\mathbf{s}(t)$ (by the success region detector) is the registration of intervals during which $\mathbf{s}(t)$ falls in an n-dimensional region, $R$, as dictated by equation (55). The result of this operation is a success waveform, which takes on the value 1 at times when $\mathbf{s}(t) \in R$, and zero at other times. This waveform is sampled periodically* (roughly at the Nyquist rate) until a specified number of samples, or trials, have been taken. The number of successes obtained and samples taken are recorded in a Success Counter and Trial Counter, respectively. By digital operations on the contents of these two counters, a number representing the ratio of successes, $n_s$, to total number of trials, $n_t$, is obtained. This quantity is the estimate of the probability that $\mathbf{s} \in R$.
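The counting logic of the serial device can be mimicked in software as follows. This is only a sketch of the two counters and the final ratio; the class and method names are hypothetical, and the success region detector is represented by an arbitrary function supplied by the caller.

```python
class SerialEstimator:
    """Success Counter and Trial Counter for a single region R (hypothetical names)."""

    def __init__(self, in_region, trials_required):
        self.in_region = in_region              # success region detector: sample -> bool
        self.trials_required = trials_required  # specified number of trials
        self.successes = 0
        self.trials = 0

    def feed(self, sample):
        """Register one sample; return True while more trials are still needed."""
        if self.trials < self.trials_required:
            self.trials += 1
            if self.in_region(sample):
                self.successes += 1
        return self.trials < self.trials_required

    def probability(self):
        """Ratio of successes to trials, the estimate of Pr{s in R}."""
        if self.trials == 0:
            raise ValueError("no trials recorded")
        return self.successes / self.trials

# Usage: estimate Pr{s_1 > 0} from four trials; a fifth sample is ignored.
est = SerialEstimator(lambda s: s[0] > 0, trials_required=4)
for sample in [(1,), (-1,), (2,), (3,), (5,)]:
    est.feed(sample)
print(est.probability())
```

To cover a histogram, one such estimator would be run for each cell in turn, which is the serial trade of processing time for equipment described above.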


The flexibility of this type of device is unsurpassed by any of the other methods of estimating higher order statistics of signals. By adjustment of the region $R$, any one of $p_n(\mathbf{x})$, $P_n(\mathbf{x})$, or in general, $\Pr\{\mathbf{s} \in R\}$ may be calculated. Also, there is no limitation to stationary signals (although samples may no longer be obtained at the Nyquist rate from nonstationary signals*). Perhaps most important, either partial or preliminary results can be obtained without having to perform a complete calculation of $p_n(\mathbf{x})$: the value of $p_n(\mathbf{x})$ can be ascertained in a few selected cells, or the cell size can be set at a large value to obtain a coarse, but possibly informative, preliminary histogram.


With this method of probability estimation, a significant intermediate data storage problem will never arise, even if further processing of the results of an estimation is desired. At worst, the results of a probability density estimation will produce $cv$ bits of data, where $c$ is the total number of cells and $p_n(\mathbf{x})$ is represented by $v$-bit quantized numbers. Assuming the previously mentioned numbers ($c = 2^{12}$ and $v = 6$), storage of a complete probability density function would require less than one-tenth of a roll of tape.

*The effects of nonstationarity of signals on accuracy of estimation are discussed in Section 2.3.4.

The success region detector can be implemented in many different ways. One fairly general approach to this problem is outlined in the next subsection. Following this is a statement of the rate at which samples should be taken from a set of signals to minimize processing time. Following this, in Section 2.3, is a discussion of sources and potential magnitudes of errors associated with the serial processing success counting method of estimation.


2.2.2.3 Establishment of a Success Region

As indicated in Figures 5 and 6, the initial step toward estimating
the probability, $\Pr\{s \in R\}$, of a set of signals falling in a given region, $R$,
is the establishment of a success region detector. In practice, the region of
interest may take on a variety of forms. In error probability calculations,
for instance, it is often the case that $R$ is defined as all values of $x$ for which
$x_j \le x_1$, $j = 2, \ldots, n$. Another region of interest is that for which


n 


In  general,  it  is  desired  that  a  success  region  detector  be  flexible 
enough  to  accommodate  all  conceivable  forms  of  the  region,  R,  but  of  course 
this  is  precluded  by  the  cost  of  such  a  device. 


One compromise which is relatively easily implemented consists of
defining regions with hyperplanes, i.e., linear inequalities. With this
method a region $R$ is defined as all points $x$ for which

$$\sum_{k=1}^{n} x_k a_{jk} > C_j, \qquad j = 1, 2, \ldots, m, \tag{57}$$

where the $\{a_{jk}\}$ constitute a set of $mn$ constants which correspond to the region,
$R$. To indicate whether a sample value $s$ falls in a given region, $R$, it is sufficient to
build $m$ resistive adders and comparators, whose outputs are routed to an AND
gate. A device for implementing these operations for $n = 2$ and $m = 3$ is indicated
in Figure 7.
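
The hyperplane test of (57) can be sketched in software. The following is a minimal illustration only, assuming the region is supplied as rows of weights $a_{jk}$ and thresholds $C_j$; the names `in_region` and `estimate_probability` and the triangular example region are hypothetical and do not appear in this report.

```python
from typing import Sequence

def in_region(sample: Sequence[float],
              weights: Sequence[Sequence[float]],
              thresholds: Sequence[float]) -> bool:
    """True when sum_k sample[k] * weights[j][k] > thresholds[j] for every j."""
    for a_j, c_j in zip(weights, thresholds):
        # Software stand-in for one resistive adder.
        weighted_sum = sum(x_k * a_jk for x_k, a_jk in zip(sample, a_j))
        # Software stand-in for one comparator.
        if not weighted_sum > c_j:
            return False
    # All m comparators agree: this plays the role of the AND gate.
    return True

def estimate_probability(samples: Sequence[Sequence[float]],
                         weights: Sequence[Sequence[float]],
                         thresholds: Sequence[float]) -> float:
    """Ratio of successes to trials for the given region."""
    successes = sum(1 for s in samples if in_region(s, weights, thresholds))
    return successes / len(samples)

# Illustrative region with n = 2, m = 3 (an assumption, not Figure 7):
# the triangle x1 > 0, x2 > 0, x1 + x2 < 1.
TRIANGLE_WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
TRIANGLE_THRESHOLDS = [0.0, 0.0, -1.0]
```

Writing "less than" constraints as "greater than" constraints with negated weights, as in the third row above, keeps every hyperplane in the single form used by (57).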

The accuracy with which a region $R$ can be approximated using
hyperplanes is quite dependent on the shape of the region. Obviously any
region which is bounded by hyperplanes can be implemented perfectly.
For instance, the regions associated with probability distributions or
histogram estimates of probability density functions (with hypercubic cells) can
be realized precisely, if the number of hyperplanes, $m$, is greater than or
equal to the number of variates, $n$, for distribution functions, or twice that
number, $2n$, for density functions. For regions bounded by curved surfaces,
however, errors in specifying the regions will result.


As an indication of the number of hyperplanes required to approximate
success regions, the ratios of the volume of inscribed and circumscribed
hyperspheres to the volume of a regular n-tope are shown in Table 1
for $n = 2$ and $n = 3$. For $n = 2$, the approximating region is a regular polygon,
and for $n = 3$ the approximating region is a regular polyhedron (of which there
are only 5).


Table 1. Ratios of Volumes of Polytopes and Hyperspheres
for n = 2, n = 3, and Several Values of m


        Polygon Approximation        Polyhedron Approximation
  m     Circumscribed   Inscribed    Circumscribed   Inscribed

  3         2.42           .62
  4         1.57           .79           8.15           .30
  5         1.32           .86
  6         1.21           .91           2.72           .52
  7         1.14           .93
  8         1.11           .95           3.14           .60
  9         1.09           .96
 10         1.07           .97
 11         1.06           .97
 12         1.05           .98           1.50           .75
 20         1.00          1.00           1.65           .82



For two dimensional regions, a dozen straight lines would probably suffice
for most regions of interest, but for three or higher dimensional regions,
perhaps a few dozen hyperplanes might be needed for accurate representation
of an arbitrary region. In practice, however, by using only $m = 2n$
hyperplanes, a histogram can be constructed over which an integration
could be carried out for any region, with accuracy limited only by the cell
size chosen for the histogram. Thus, construction of a success region
detector with more than $2n$ hyperplanes would probably not be justified
unless a specific region is to be investigated for a variety of inputs.
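
The polygon columns of Table 1 can be checked from elementary geometry. The sketch below assumes the standard area formulas for a regular m-sided polygon, $(m/2) R^2 \sin(2\pi/m)$ in terms of the circumradius $R$ and $m r^2 \tan(\pi/m)$ in terms of the inradius $r$; the function name `polygon_ratios` is hypothetical. The computed values agree with the tabulated polygon entries to within about two hundredths.

```python
import math
from typing import Tuple

def polygon_ratios(m: int) -> Tuple[float, float]:
    """Return (circumscribed circle area / polygon area,
               inscribed circle area / polygon area) for a regular m-gon."""
    # Circle of radius R over polygon of area (m/2) R^2 sin(2*pi/m).
    circumscribed = 2.0 * math.pi / (m * math.sin(2.0 * math.pi / m))
    # Circle of radius r over polygon of area m r^2 tan(pi/m).
    inscribed = math.pi / (m * math.tan(math.pi / m))
    return circumscribed, inscribed

if __name__ == "__main__":
    for m in (3, 4, 5, 6, 8, 12, 20):
        outer, inner = polygon_ratios(m)
        print(f"m = {m:2d}  circumscribed = {outer:.2f}  inscribed = {inner:.2f}")
```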


2.2.2.4 Sampling Rate Adjustment

In practice, selection of an appropriate sampling rate poses a
problem. If a set of signals, $s(t)$, is sampled at a high rate, dependent
data are obtained which carry very little statistical information, perhaps
even in a large amount of data. On the other hand, sampling at too low a
rate, although yielding independent data, requires a long data collection
time. The guide to selection of an intermediate sampling rate is given
by the sampling theorem*: if a process is bandlimited to $W$ cycles per
second, samples taken $\frac{1}{2W}$ seconds apart just suffice to reconstruct the
time function exactly. That is, this rate of sampling, $2W$ samples per
second, does not miss any of the "important changes" in the time function,
yet does not yield a large amount of superfluous data.


In order to apply this theorem to the problem at hand, the bandwidth
of the process must be known. A quick rough overestimate of the bandwidth
$W$, perhaps by means of a spectrum analyzer, would suffice to allow for proper
adjustment of sampling rate.
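
As a rough numerical illustration of this guideline, the sketch below computes the sampling rate $2W$, the spacing between samples, and the data collection time for a chosen number of samples. The function name `sampling_plan` and the example figures (a 10 kcps bandwidth and $2^{19}$ samples) are assumptions made here for illustration.

```python
from typing import Tuple

def sampling_plan(bandwidth_hz: float, num_samples: int) -> Tuple[float, float, float]:
    """Return (samples per second, seconds between samples, total seconds)."""
    rate = 2.0 * bandwidth_hz          # 2W samples per second
    interval = 1.0 / rate              # 1/(2W) seconds between samples
    duration = num_samples * interval  # time needed to collect all samples
    return rate, interval, duration

if __name__ == "__main__":
    rate, interval, duration = sampling_plan(10_000.0, 2**19)
    print(f"rate = {rate:.0f} samples/sec, "
          f"interval = {interval * 1e6:.0f} microseconds, "
          f"duration = {duration:.1f} seconds")
```

An overestimate of $W$ raises the sampling rate and shortens the collection time, at the cost of some dependence between adjacent samples.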


2.3 ACCURACY ATTAINABLE WITH THE SUCCESS COUNTING METHOD


The  accuracy  with  which  a  probability  distribution  or  density  function 
can  be  estimated  using  the  success  counting  method  is  limited  by  the  quantization 
of  signal  space,  the  processing  time,  and  equipment  inaccuracies. 


2.3.1  Quantization  Error 


Consider first the effect of quantization. If unlimited processing time
is available, and no equipment inaccuracies exist, then the ratio of (a) the
number of trials which produced values of the signal set within a particular
region $R$ in signal space, $n_s$, to (b) the total number of trials, $n_t$, can be
made arbitrarily close to the true probability that the set of signals lies in
that region. Explicitly, if $n_t$ can be made arbitrarily large, then the law of
large numbers states that it is possible to make

$$\left| \frac{n_s}{n_t} - \Pr\{s \in R\} \right| < \epsilon \tag{58}$$

for any given $\epsilon > 0$, and any region $R$ in signal space. The histogram height, $h$,
for a given cell in signal space is defined as $h = \frac{n_s}{n_t \, \Delta v}$, where $\Delta v$ is the volume
of the cell. The inequality (58) indicates that the volume under the histogram for
a cell is the same as the volume under the true probability density function, i.e.,
the true probability of the signals falling in the cell.


For  any  region  in  signal  space  whose  boundary  lies  only  on  boundaries 
between  cells,  the  probability  of  a  set  of  signals  falling  in  this  region  can  be 
estimated  arbitrarily  closely  with  the  success  counting  method.  However,  for 
a  region  whose  boundary  deviates  from  cell  boundaries,  the  estimated  probability 
of  a  set  of  signals  falling  in  the  region  (obtained  by  integrating  under  the  histogram) 
can  differ  from  the  true  probability.  To  obtain  some  feeling  for  the  adequacy  of  a 
given  quantization  and  cell  structure,  we  may  introduce  a  quantity  which  provides 
some  indication  of  the  degree  to  which  the  estimated  probabilities  (obtained  with 
histograms)  differ  from  the  true  probabilities.  One  such  quantity  is 


$$D = \frac{1}{V} \int_R \left| P_n(x) - \hat{P}_n(x) \right| dx \tag{59}$$

where

$V$ = the total volume of signal space, $R_1 R_2 \cdots R_n$,

$P_n(x) = \Pr\{s_1 \le x_1, s_2 \le x_2, \ldots, s_n \le x_n\}$,

$\hat{P}_n(x) = \int_{(s_i \le x_i,\ i = 1, 2, \ldots, n)} h(s)\, ds$,

and $R$ denotes the region in which signals can arise. The quantity $D$ measures
the average absolute deviation of the estimated probability, $\hat{P}_n(x)$, from the true
probability distribution, $P_n(x)$. For simplicity of illustration, consider the single
variate case, and assume that the range is partitioned into $c$ cells having the
same width, $\Delta$. Then, the distribution function, $P_1(x)$, is a function of the single
variable, $x$, and a histogram can be constructed which provides a piecewise linear
approximation to $P_1(x)$, as indicated in Figure 8, with $c = 8$. As noted above,
the estimated probability is equal to $P_1(x)$ for $x = i\Delta$, $i = 1, 2, \ldots, c$, and
generally differs from $P_1(x)$ at other values of $x$.


The value obtained for the quantity $D$ is dependent on the probability
distribution function. From the standpoint of maximizing $D$, the worst probability
distribution consists of a staircase function with jumps at the cell boundaries, as
indicated by the dashed curve in Figure 8. For this distribution function, the
average absolute difference between the true probability distribution and the
estimated value of this quantity is equal to $\frac{1}{2c}$. Thus to obtain an average deviation
of less than 0.1 (in probability), it is necessary only to choose $c > 5$. In general,
for an n-variate distribution, the average absolute difference between the true
probability distribution and the estimated quantity is less than or equal to


i  [*  -  iar 


where $c$ is the total number of (equi-volume) cells in the signal space.
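
The single variate worst case can be checked numerically. The sketch below is an illustration under stated assumptions: the signal range is normalized to the interval from 0 to 1, the staircase distribution has equal jumps of $1/c$ at the upper boundary of each cell, and the histogram estimate is the straight line joining the values at the cell boundaries. The function name `staircase_deviation` is hypothetical. Under these assumptions the average absolute deviation comes out as $1/(2c)$, which is below 0.1 once $c$ exceeds 5.

```python
import math

def staircase_deviation(c: int, points_per_cell: int = 1000) -> float:
    """Average |estimate - true| over [0, 1] for the staircase distribution."""
    n = c * points_per_cell
    total = 0.0
    for i in range(n):
        x = (i + 0.5) / n                 # midpoint of the i-th subinterval
        p_true = math.floor(x * c) / c    # staircase: jumps of 1/c at x = k/c
        p_est = x                         # straight line through the points (k/c, k/c)
        total += abs(p_est - p_true)
    return total / n

if __name__ == "__main__":
    for c in (2, 5, 8):
        print(c, staircase_deviation(c), 1.0 / (2 * c))
```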


2.3.2 Error Caused By Finite Processing Time

As pointed out earlier, if unlimited processing time is available, then
the ratio of the number of "successes" to the number of samples, or trials, can
be made arbitrarily close to the probability of a set of signal amplitudes falling
in a region $R$. The difference between the estimated and actual values will tend
to be greater for shorter processing times.

Figure 8. Probability Distribution Function Error Introduced
By Quantization


For the success counting method of estimation, the processing time, $T$,
can be related to the number of statistically independent* samples available in
that time. Specifically, a process $\{x(t)\}$ which is limited in bandwidth to a range
$W$ provides roughly $N = 2TW$ independent samples in a time, $T$. Therefore, to
determine the error imposed by a given limited processing time, it is sufficient
to relate the error of an estimation to the number of independent samples used.

The customary method of measuring the error of an estimate, $\hat{P}$, of a
probability, $P$, is in terms of the average square of the difference between these
quantities:

$$e^2 = \overline{(\hat{P} - P)^2} = \overline{\left( \frac{N_s}{N} - P \right)^2} \tag{60}$$

where $N_s$ is the number of trials resulting in successes. The quantity $N_s$ has a
binomial probability density function,

$$p(N_s) = \sum_{k=0}^{N} \binom{N}{k} P^k (1 - P)^{N-k}\, \delta(N_s - k) \tag{61}$$

and the quantity $e$ is readily calculated:

$$e = \sqrt{\frac{P(1 - P)}{N}} \tag{62}$$

A more demanding measure of error, which would reflect the relative accuracy
attained for any probability $P$, is provided by

$$e_r = \frac{e}{P} = \sqrt{\frac{1 - P}{NP}} = \sqrt{\frac{1 - P}{S}} \tag{63}$$

where $S = NP$ is the expected number of successes in $N$ trials. For a given number of
trials, (63) indicates that the relative error is much larger for small probabilities.
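
These error measures are easy to evaluate numerically. The sketch below assumes the closed forms $e = \sqrt{P(1-P)/N}$ and $e_r = e/P$ for $N$ independent trials, and compares the closed form for $e$ with a direct average of $(k/N - P)^2$ over the binomial law. The function names are hypothetical.

```python
import math

def rms_error(p: float, n: int) -> float:
    """Closed-form r.m.s. error of the estimate N_s / N."""
    return math.sqrt(p * (1.0 - p) / n)

def rms_error_exact(p: float, n: int) -> float:
    """Same quantity, obtained by averaging (k/N - P)^2 over the binomial law."""
    mean_square = sum(
        math.comb(n, k) * p**k * (1.0 - p)**(n - k) * (k / n - p)**2
        for k in range(n + 1)
    )
    return math.sqrt(mean_square)

def relative_error(p: float, n: int) -> float:
    """Relative error e / P, equal to sqrt((1 - P) / (N P))."""
    return rms_error(p, n) / p

if __name__ == "__main__":
    print(rms_error(0.3, 20), rms_error_exact(0.3, 20))
    # Same number of trials, very different relative accuracy:
    print(relative_error(0.5, 10_000), relative_error(0.01, 10_000))
```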


It would, of course, be desirable to be able to know in practice precisely
the number of samples required to estimate an unknown probability with a specified
relative accuracy. However, from (63) we know only the relationship between the
error and the (unknown) true probability. One way to obtain an estimate of the
error (in estimating $P$) is to substitute the estimate, $\hat{P} = \frac{N_s}{N}$, into the formula
for $e_r$, obtaining

$$e_r = \sqrt{\frac{N - N_s}{N N_s}} \tag{64}$$

For small values of $P$ (the more difficult or time consuming range), we
would hope to have $N_s \ll N$ and (64) would then become

$$e_r \approx \frac{1}{\sqrt{N_s}} \tag{65}$$


*[2], p. 114.

Thus,  to  control  the  relative  estimation  error,  it  might  be  reasonable  and 
economical  to  operate  the  success  counting  device  for  the  period  of  time  it  takes 
to  produce  a  specified  number  of  successes. 
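
The idea of running until a specified number of successes has accumulated can be sketched as follows. This is an illustration only: it assumes the error estimate $\sqrt{(N - N_s)/(N N_s)}$ and its small-probability form $1/\sqrt{N_s}$, and the function names and the boolean outcome stream are hypothetical.

```python
import math
from typing import Iterable, Tuple

def estimated_relative_error(successes: int, trials: int) -> float:
    """Relative error estimated from the counts themselves."""
    return math.sqrt((trials - successes) / (trials * successes))

def small_probability_approximation(successes: int) -> float:
    """Limiting form of the estimate when successes are rare."""
    return 1.0 / math.sqrt(successes)

def count_until(outcomes: Iterable[bool], target_successes: int) -> Tuple[int, int]:
    """Count trials until the target number of successes is reached.

    Returns (successes, trials); stops early if the stream runs out.
    """
    successes = 0
    trials = 0
    for outcome in outcomes:
        trials += 1
        if outcome:
            successes += 1
            if successes == target_successes:
                break
    return successes, trials
```

With 100 successes the approximate relative error is 10 percent regardless of how many trials were needed, which is why the stopping time varies from run to run.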


Although this approach to error control is appealing at first sight, we
have decided not to use it. The primary reasons for not using it are (a) the
expression (65) is valid only if the accuracy of estimation is rather high anyway;
(b) the actual processing time would be a randomly fluctuating quantity; and (c) for
signals with bandwidth in the range 5 kcps - 10 kcps, the error can be controlled
in a way which does not involve these difficulties.


A convenient, and perhaps more realistic, method of controlling the
estimation error due to limited processing time consists of simply incorporating a
wide dynamic range in the success and trial counters. If the input signal
bandwidth is so small, or the number of cells so large, that an intolerably long
processing time is involved in computing probabilities with the maximum number of
trials available, then a succession of calculations using fewer trials could usually
establish the number above which the estimate can be expected to remain
essentially unchanged. While this procedure would economize on processing time,
there may exist probability calculations which cannot be accomplished with a
given maximum number of trials (except by combining the results of several
computations using independent data).

To determine the maximum number of trials it would be reasonable to
attempt to record in the trial counter, we must consider the lowest probability
we would ever want to compute. A reasonable assumption for the latter quantity
is the probability of a set of signals falling in a single histogram cell defined for
an n-th order process, where n is the highest order to be considered. It was
suggested earlier that it would not be reasonable to attempt to calculate more
than fourth order statistics using a cell structure which quantizes each variate
into eight intervals. Thus, no more than $2^{12} = 4096$ cells would be involved.
If the minimum value of $P$ to be encountered in any practical application is
assumed to correspond roughly to the uniform distribution of fourth order signal
amplitudes over all of these $2^{12}$ cells ($P = 2^{-12}$), and if a relative error of at
most ten percent is allowed for probabilities down to $2^{-12}$, then (63) indicates
that the number of trials should be on the order of $2^{12} \cdot 100 \approx 2^{19}$. The trial
and success counters in the success counting probability analyzer described in
Section 3 have been designed to accommodate this number of samples (or fewer).
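
The counter sizing argument can be reproduced with a few lines of arithmetic. The sketch below assumes the relative error expression $e_r = \sqrt{(1 - P)/(NP)}$ solved for $N$; the function names are hypothetical.

```python
import math

def required_trials(p_min: float, relative_error: float) -> float:
    """Trials needed so that sqrt((1 - P) / (N P)) equals the given relative error."""
    return (1.0 - p_min) / (p_min * relative_error**2)

def counter_bits(trials: float) -> int:
    """Smallest number of binary stages whose capacity reaches the trial count."""
    return math.ceil(math.log2(trials))

if __name__ == "__main__":
    n = required_trials(2.0**-12, 0.10)
    print(f"trials needed: about {n:.0f}, counter stages: {counter_bits(n)}")
```

For $P = 2^{-12}$ and a ten percent relative error this gives roughly 410,000 trials, which fits within a count of $2^{19}$.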


2.3.3 Errors Caused By Equipment Inaccuracies

The two major types of errors introduced by imperfect implementation
of the success counting method are (1) warping of the success waveform, and
(2) sampling discrepancies resulting from the use of non-zero width sampling
pulses. Consider first the sources and nature of the first type of error. One
source of error in producing a success waveform is the improper location of
boundaries of the given region. In general, the error in estimation of a
probability which results from improper location of success region boundaries is
dependent on the nature of the probability density function of the signals involved.
For a uniform probability density function defined over an n-dimensional cell
structure consisting of equivolume cells, specification of cell boundaries with
$a$ percent accuracy results in a possible probability density estimation error of
$na$ percent. Fortunately, even for the most fine-grained quantization desired
(8 levels per dimension), it is quite easy to establish a 1 percent accuracy of
boundary placement, which results in a 4 percent error in fourth order probability
density function estimation.


Other sources of success waveform distortion are switching delays in
the Schmitt triggers, varying delays in different input channels, and logic delays.
All of these delays serve to create shifts, compression, or expansion of the
success waveform. These effects are portrayed in Figure 9 for a sinewave,
using a cell width equal to one-fourth of the peak-to-peak amplitude of that
waveform. The ideal success waveform consists of pulses with width $w$, spaced
one period apart. The actual success waveform consists of pulses with one of
two widths, $w_1$ and $w_2$, spaced approximately one period apart. Every other
pulse has width $w_1$, corresponding to the upward movement of the sinewave
through the success region, and these pulses are interlaced in time with pulses
with width $w_2$, corresponding to the downward movement of the sinewave through
the success region. If many samples are taken at a rate which is incommensurate
with the frequency of the sinewave, then translation of the pulses in the
success waveform will not affect a first-order estimation at all. For higher
order statistics, differences in translations in success regions for different
variates may tend to "smear" a probability density estimation. However, as
long as these differences are small in comparison with the minimum success
waveform pulse width, the effect on an estimate is negligible. Calibration
results which illustrate this effect are presented in Section 3.


Figure 9. Success Waveform for a Sinewave Input (desired and distorted waveforms)


The magnitude of the difference between the desired result and the
actual result obtained for the probability that the sinewave falls in the given
region is $\left| \frac{1}{2}[w_1 + w_2] - w \right|$. In practice, the difference in switching times
of different Schmitt triggers will be the major cause of deviations in $w_1$ and $w_2$
from $w$. However, for signals and regions for which roughly as many upward
traversals are produced as downward traversals, it is easily seen that error
cancellation takes place: $w_1 + w_2$ will very nearly equal $2w$. This cancellation
effect takes place whether or not the desired success waveform pulse widths
are fixed. Therefore, the error introduced by distortion in the success
waveform can be considered insignificant. This conclusion is verified by calibration
results presented in Section 3.


The other type of error is introduced through the use of non-zero width
sampling pulses. For a sampling pulse width $\delta$, and a given width of a success
waveform pulse, $w$, the maximum effective change in the success waveform
produces a relative error of $\frac{\delta - 2\mu}{w}$, where $\mu$ is the minimum overlap between
a counting pulse and a success waveform pulse which will be counted as a success.
This can be seen by noting that the counting pulse can be located at any point in an
interval of duration $w + \delta - 2\mu$ to cause a success to be registered. To determine
the net effect of this counting error on a probability estimate would require
that the quantity $\frac{\delta - 2\mu}{w}$ be averaged over all possible values of success
waveform pulse width, $w$, weighting each value by the probability of that value occurring.
Except for deterministic signals, this problem evidently cannot be solved.


Fortunately this calculation can be circumvented by examining two
special cases: (1) the deterministic signal and success region which produces
the smallest value for $w$ (and therefore the largest estimation error), and
(2) a random process with known characteristics. In each case we want to
know the smallest value of $w$ which can be expected to occur.


The deterministic signal which produces the shortest success waveform
is a sinewave with the maximum frequency, $W$, for which it is desired to
obtain probability estimates. In this study, $W = 10$ kcps. For a sinewave, the
minimum success waveform pulse width, $w_0$, is given approximately by

$$w_0 \approx \frac{1}{\pi c W} \ \text{seconds} \tag{66}$$

where $c$ is the number of equi-amplitude cells covering the sinewave amplitude
range. For $c = 8$, this minimum pulse width is $w_0 \approx 4\ \mu$sec. Thus, if the
sampling pulse width, $\delta$, is equal to 2 $\mu$sec, and the minimum overlap between
a counting and success waveform pulse which will produce a success is 0.9 $\mu$sec,
the relative error in estimation will be at most 5 percent. It should be pointed
out that this is an upper bound for the worst combination of deterministic signal
and success region. For other signals and success regions, the relative error
should be less than that implied by (66).
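
The figures quoted above can be reproduced directly. The sketch below assumes the approximation $w_0 \approx 1/(\pi c W)$ for the shortest sinewave success pulse and the relative counting error $(\delta - 2\mu)/w$; the function names are hypothetical.

```python
import math

def min_pulse_width(cells: int, max_frequency_hz: float) -> float:
    """Shortest success pulse for a sinewave crossing one of `cells` equal cells."""
    return 1.0 / (math.pi * cells * max_frequency_hz)

def counting_relative_error(sample_width: float,
                            min_overlap: float,
                            pulse_width: float) -> float:
    """Relative error (delta - 2 mu) / w caused by a non-zero width sampling pulse."""
    return (sample_width - 2.0 * min_overlap) / pulse_width

if __name__ == "__main__":
    w0 = min_pulse_width(8, 10_000.0)
    print(f"w0 = {w0 * 1e6:.2f} microseconds")
    print(f"relative error = {counting_relative_error(2e-6, 0.9e-6, 4e-6):.3f}")
```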


To check this result for a specific noise waveform, the probability
that the magnitude of the slope of a bandlimited Gaussian signal will be greater
than a given value, $\alpha$, has been found:

$$\Pr\{ |\text{slope}| > \alpha \} = 2\Phi\left( -\frac{3\alpha}{\pi R W} \right) \tag{67}$$

where $\Phi(x) \triangleq \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-u^2/2}\, du$, and where $R$ is the range of signal amplitudes
being considered, and $W$ is the noise bandwidth. By substituting the value $\alpha = \frac{R}{cw}$
in (67), the probability of obtaining a success waveform pulse width less than $w$
can be written*

$$\Pr\{ \text{success pulse width} < w \} = 2\Phi\left( -\frac{3}{\pi c w W} \right) \tag{68}$$


*Assuming that the noise waveform does not leave the success region at the
same extremity through which it entered the region.

The probability that a success waveform pulse will be generated with a width
less than the minimum pulse width of a deterministic signal ($w = w_0$) is therefore

$$\Pr\{ \text{success pulse width} < w_0 \} = 2\Phi(-3) \approx 2.6 \times 10^{-3} \tag{69}$$

Thus, it appears unlikely that a greater counting error will result with a noisy
waveform than is obtained with the high frequency sinewave. Results of
calibration tests using a sine wave are reported in Section 3.
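
The value of $2\Phi(-3)$ in (69) can be evaluated with the error function. The sketch below assumes that the probability of a success pulse narrower than $w$ has the form $2\Phi(-3/(\pi c w W))$, which reduces to $2\Phi(-3)$ when $w = 1/(\pi c W)$; that form is a reading of the surrounding equations rather than a quotation, and the function names are hypothetical.

```python
import math

def phi(x: float) -> float:
    """Standard Gaussian distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_pulse_narrower_than(w: float, cells: int, bandwidth_hz: float) -> float:
    """Assumed form 2 * Phi(-3 / (pi c w W)) for a bandlimited Gaussian input."""
    return 2.0 * phi(-3.0 / (math.pi * cells * w * bandwidth_hz))

if __name__ == "__main__":
    cells, bandwidth = 8, 10_000.0
    w0 = 1.0 / (math.pi * cells * bandwidth)
    print(f"Pr(width < w0) = {prob_pulse_narrower_than(w0, cells, bandwidth):.2e}")
```

The computed value is close to $2.7 \times 10^{-3}$, in line with the order of magnitude quoted in (69).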


2.3.4  Effects  of  Non-Stationarity  of  Signals 

A few comments on the effects of non-stationarity of a process on a
probability density function estimate are appropriate at this point. First, for
a stationary process, as pointed out in the earlier sections, the accuracy of
p.d.f. estimates is limited primarily by the processing time, improving as the
processing time increases. However, for a non-stationary process, additional
errors can result, depending on the rate of fluctuation of the statistics. There
are three types of situations which can arise: where the time constant of the
statistical (non-stationary) fluctuations is small, intermediate, and large,
respectively, compared with the time required to estimate the entire p.d.f.


For the first case, where the time constant is small, the process will
pass through all its "modes" many times during the processing interval. In
this case, the output of the estimating device is an average of the actual
(time-varying) input process p.d.f. If the estimation procedure were duplicated
on a different serial section of input data, the same average p.d.f. would result,
and an observer would not even be aware of the rapid statistical changes in the
input process. This is not necessarily a deleterious effect; however, it is well to
be aware of its possible presence and effect on short term decisions.


At the other extreme of a large time constant, processing of the input
samples is completed before the input process changes its statistical behavior
significantly. (For example, the power in the input process may slowly increase
and double its value in an hour.) For this case, the estimation device yields a
local estimate of the true input p.d.f. If the estimation procedure were repeated
on successive serial sections of input data, gradual changes in the estimated p.d.f.
would appear, thereby indicating conclusively a non-stationary trend in the input data.

For intermediate time constant values, significant input statistical
changes occur in a time comparable with the processing time. For this
situation, a sequential search of p.d.f. space can yield inconsistent results
(e.g., the area under the p.d.f. not equal to unity), caused by a "jumping"
about of the input process in signal space. The best way to alleviate (but not
eliminate) this situation is to record a section of input data which is long
enough to accurately evaluate the p.d.f. in one cell of p.d.f. space and rerun
the same record for each and every cell of p.d.f. space sequentially. If the
time constant is comparable with the processing time for one cell p.d.f.
estimation, there will still be significant changes in the total estimated p.d.f.,
record to record. However, local estimates of the p.d.f. will result which
will be consistent (sum up to unity) and will indicate accurately the degree of
non-stationarity and its rate of change.


3. A FOURTH ORDER SUCCESS COUNTING PROBABILITY ANALYZER


3.1 GENERAL DESCRIPTION

A breadboard of a success counting probability analyzer has been
constructed in accordance with the method* described in Section 2.2.2.2
of this report. A photograph of the unit is shown in Figure 10. The
breadboard is basically a flexible, digital 12-th order binary probability analyzer.
It is arranged functionally into four octal channels, making it a fourth order
analyzer with 3 bit quantization in each dimension.


The serial success counting method, whose mathematical description
is given by (55), is used for measurement of $\Pr\{s \in R_{x,\Delta}\}$, with twenty bit
trial and success counters. A block diagram of the device is shown in
Figure 11. Each of the four channels is provided with the necessary analog
and digital circuits required for establishing the success region, $R_{x,\Delta}$,
and generating the success waveform $f(s)$ as described by equation (47). An
internal clock and a pulse shaper for an external clock are provided for
sampling of the success waveform at frequencies up to 20 kc with sample
sizes of $2^8$ through $2^{19}$ selectable in 12 steps.


The ratio of the numbers in the success and trial counters is
normalized and displayed in a row of lights as a 6 bit binary mantissa and
a 4 bit binary (1's complement form) characteristic. The output is also
automatically punched as two 6 bit characters on paper tape in a format
compatible with the CDC 160 computer (the extra two bits are used as a control
code).


The size of the success region $R_{x,\Delta}$ for each component of $s$ may
be set (independently for each channel) to 1.25, 2.5, or 5 volts, and the
center of the success region may be stepped through 8, 4, or 2 equally spaced
intervals in each channel (step size independently selected for each channel).
The cell location (center of the region $R_{x,\Delta}$) may be stepped either manually
or automatically to a new location at the end of each measurement. The
entire 4096 cell space or selected regions may be covered automatically by
appropriate settings of front panel controls.


*See Figure 6 for a generic block diagram of the method.


Figure 11. Probability Analyzer Block Diagram


As indicated in Figure 12, two logical AND gates, A and B, are
provided with means for switching any combination of the four success
waveforms independently to either AND gate. Gate A is sampled by the
clock pulses and Gate B may be sampled by either success pulses from
the output of Gate A or delayed success pulses from Gate A. The trial
counter may be switched to count either clock pulses or Gate A success
pulses. The success counter may be connected to the output of either Gate
A or Gate B. Thus the analyzer may be set up to measure (1) the probability
of a success at Gate A, (2) the probability of a success at Gate B, given a
success at Gate A, (3) the probability of a success at Gate A and at a later
time a success at Gate B, or (4) the probability of a success at Gate B given
that a success occurred at some fixed earlier time at Gate A, where success
at Gate A or Gate B may independently indicate the joint occurrence of up
to four input signals in corresponding independently set regions.
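
The four measurement modes can be mimicked in software by treating the Gate A and Gate B conditions as boolean sequences evaluated at each clock pulse. This is only a sketch of one reading of the description above: the delay is assumed to be a whole number of clock periods, trials near the end of the record that have no delayed partner are dropped, and the function name `gate_counts` and its parameters are hypothetical.

```python
from typing import Sequence, Tuple

def gate_counts(gate_a: Sequence[bool],
                gate_b: Sequence[bool],
                delay: int = 0,
                conditional: bool = False) -> Tuple[int, int]:
    """Return (successes, trials) for one setting of the counter switches.

    conditional=False: the trial counter counts clock pulses.
    conditional=True:  the trial counter counts Gate A successes only.
    A success is registered when Gate A is true at pulse i and Gate B is
    true `delay` pulses later.
    """
    successes = 0
    trials = 0
    for i in range(len(gate_a) - delay):
        if conditional and not gate_a[i]:
            continue  # no Gate A success, so no trial is counted
        trials += 1
        if gate_a[i] and gate_b[i + delay]:
            successes += 1
    return successes, trials

if __name__ == "__main__":
    a = [True, False, True, True]
    b = [True, True, False, True]
    print(gate_counts(a, [True] * len(a)))            # mode (1): Gate A alone
    print(gate_counts(a, b, conditional=True))        # mode (2): B given A
    print(gate_counts(a, b, delay=1))                 # mode (3): A, then B later
    print(gate_counts(a, b, delay=1, conditional=True))  # mode (4): B given earlier A
```

Dividing successes by trials gives the corresponding probability estimate in each mode.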


3.2 DETAILED DESCRIPTION OF EQUIPMENT
3.2.1 Basic Units of the Analyzer

The block diagram of Figure 11 shows the major units of the
breadboard. The function of the Success Region Detector is performed by the
Cell Location Counter and the Success Indicator. The Arithmetic Unit
includes the Trial and Success Counters plus the control logic required to:
(a) set the sample size, (b) normalize the result, (c) initiate the punch cycle,
and (d) automatically advance the Cell Location Counter at the end of each
measurement cycle.


3.2.2  Success  Indicator 


A Simplified Block Diagram of the Success Indicator is shown in
Figure 12. A success waveform is generated for each channel by the use
of two hyperplane circuits (comparators), a digitally controlled reference
voltage, and two offset voltages, +Δ/2 and -Δ/2. One comparator has a
"1" output when the input signal is less than the reference voltage plus Δ/2
and the other has a "1" output when the input signal is greater than the
reference voltage minus Δ/2. If both comparator outputs are "1" then the
success waveform f(s) is "1" and s is within the selected cell (or success
region) in that channel. The binary input which controls the reference voltage
comes from the Cell Location Counter (see Section 3.2.4). Both the cell
size, Δ, and the cell center in each channel can be set with an accuracy of
better than ±0.5 percent with a long term stability (several days) of
better than 0.5 percent. The effective cell size accuracy is degraded
near the upper frequency limit of the analyzer, to a worst case value of
10 percent at 10 kc.

Figure 12. Simplified Diagram of Success Indicator


A fourth position of the cell size control sets Δ = 0 and allows
for selection of either comparator so that the success waveform indicates
success for input signals greater than (or less than) the reference input.
This position may be used for measurement of distribution functions.
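
The window comparison described above can be sketched in a few lines of Python. This is only an illustrative model of the logic, not of the breadboard circuitry; the function names, the argument `delta` (standing for the cell size) and the example voltages are assumptions made for the sketch.

```python
def success_waveform(s, reference, delta):
    """Return 1 if the input s lies inside the cell of width delta
    centered on the reference voltage, else 0."""
    below_upper = s < reference + delta / 2   # first comparator
    above_lower = s > reference - delta / 2   # second comparator
    return int(below_upper and above_lower)


def distribution_waveform(s, reference, greater=False):
    """Model of the zero-width cell position: a single comparator
    reports whether s is below (or, if greater=True, above) the
    reference, which is what a distribution function measurement needs."""
    return int(s > reference) if greater else int(s < reference)


# Hypothetical example: a 1.25 v cell centered at 0.625 v covers 0 to 1.25 v.
print(success_waveform(0.5, 0.625, 1.25))   # input inside the cell
print(success_waveform(2.0, 0.625, 1.25))   # input outside the cell
```

Averaging `success_waveform` over many samples estimates the probability of the cell; averaging `distribution_waveform` estimates the distribution function at the reference level.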


Two independent 1-st to 4-th order success waveforms may be
obtained by switching any combination of the four first order success
waveforms to either or both of two logical AND gates: Success Gate A and
Success Gate B. The Gate Input Selector Switches may be set independently
for each gate. The clock pulses are applied as a 5-th input to Gate A as
well as to the Mode Selector and Control Logic. The modified clock pulses,
applied to the 5-th input to Gate B, may be either the output pulses directly
from Gate A or delayed Gate A output pulses. The Conditioned Sample
Pulses are obtained either from the clock directly or from Gate A. The
Success Pulses are obtained from either Gate A or Gate B. The mode
selector is therefore able to set up the analyzer to measure any one of four
quantities:

1) P(A): the joint probability that the signals at the inputs of all
channels whose outputs are connected to Gate A will fall within the region
selected for their respective channels.

2) P(B/A): the conditional probability of a success at Gate B,
given a success at Gate A, where success at Gate A or Gate B is defined
as in Mode 1.

3) P(B, A delayed): the joint probability of a success at Gate A
and at some specified time later a success at Gate B.

4) P(B/A delayed): the conditional probability of a success at
Gate B given a success at some specified earlier time at Gate A.


A  fifth  position  of  the  mode  selector  is  made  available  for  future 
expandability  of  the  breadboard,  namely:  P(A  or  B). 
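
The four modes differ only in which pulses feed the trial counter, which gate feeds the success counter, and whether Gate B is sampled by direct or delayed Gate A pulses. The following Python sketch models that bookkeeping on recorded 0/1 gate waveforms. The table `MODES`, the function `measure`, and the representation of the delay as a whole number of clock pulses are all assumptions made for illustration.

```python
# mode: (trial counter source, success counter source, Gate A pulses delayed?)
MODES = {
    1: ("clock", "A", False),   # P(A)
    2: ("A", "B", False),       # P(B/A)
    3: ("clock", "B", True),    # P(B, A delayed)
    4: ("A", "B", True),        # P(B/A delayed)
}


def measure(mode, a_wave, b_wave, delay):
    """Return (trials, successes) for one mode.

    a_wave[k] is the 0/1 output of Gate A at clock pulse k.
    b_wave[k] is the 0/1 state of the Gate B signal inputs at pulse k.
    delay is the Gate A pulse delay in clock pulses (used in modes 3, 4).
    """
    trial_src, success_src, delayed = MODES[mode]
    shift = delay if delayed else 0
    trials = successes = 0
    for k in range(len(a_wave) - shift):
        gate_a = a_wave[k]
        # Gate B only produces a pulse when sampled by a Gate A pulse.
        gate_b = gate_a and b_wave[k + shift]
        trials += 1 if trial_src == "clock" else gate_a
        successes += gate_a if success_src == "A" else gate_b
    return trials, successes


a = [1, 0, 1, 1, 0, 1]
b = [1, 1, 0, 1, 1, 0]
for mode in MODES:
    print(mode, measure(mode, a, b, delay=1))
```

The ratio successes/trials is the probability estimate for the selected mode.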

3.2.3 Arithmetic Unit


The Success Indicator outputs (i.e., the conditioned sample pulses
and success pulses) are fed to the Arithmetic Unit where they are counted
in two 20 bit counters: the Trial Counter and Success Counter respectively.
Control circuits are provided to allow sampling to continue until the Trial
Counter has reached a selected (by the Sample Size Selector Switch) integral
power of 2 between 2® and 2^#, at which time the counting is inhibited.
The completion of sampling is detected by the presence of a "1" in the
selected (control) bit of the Trial Counter (and "0's" in all other bits). The
Sample Size Selector Switch also selects the Success Counter bit corresponding
to the selected Trial Counter control bit plus the five next less significant
bits as the probability output.


On completion of the sampling process, the contents of the Success
Counter are normalized by shifting until the most significant output bit is "1",
or for a maximum of 15 shift pulses. The shift pulses are counted in the
least significant four bits of the Trial Counter (which are all zero on
completion of the sampling interval), which therefore contain the characteristic
of the probability. The six selected bits of the Success Counter contain the
Mantissa. On completion of the normalization process, a pulse is
transmitted to the punch logic to initiate the punch cycle and to the cell
location counter to advance it to the next cell location.
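
The normalization step can be illustrated with a short Python sketch. It assumes that the shift is toward the more significant end of the Success Counter and that `control_bit` is the bit position selected by the Sample Size Selector Switch (so the sample size is 2**control_bit); the function name and arguments are hypothetical.

```python
def normalize(success_count, control_bit, max_shifts=15):
    """Return (characteristic, mantissa) for a success count accumulated
    over 2**control_bit trials.

    The count is shifted left until the bit at control_bit is 1, or
    until max_shifts shifts have been made.  The mantissa is the six
    bits from control_bit down to control_bit - 5.
    """
    if control_bit < 5:
        raise ValueError("five bits are needed below the control bit")
    top = 1 << control_bit
    shifts = 0
    while not (success_count & top) and shifts < max_shifts:
        success_count <<= 1
        shifts += 1
    mantissa = (success_count >> (control_bit - 5)) & 0b111111
    return shifts, mantissa


# 90 successes in 1024 trials (control_bit = 10)
characteristic, mantissa = normalize(90, 10)
print(characteristic, mantissa)
# Under the assumptions above the estimate is mantissa * 2**-(characteristic + 5)
print(mantissa * 2.0 ** -(characteristic + 5), 90 / 1024)
```

With this reading, a success on every trial gives characteristic 0 and mantissa 100000 (binary), and a count of zero ends after 15 shifts with a zero mantissa.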


3.2.4 Cell Location Counter


The Cell Location Counter consists of a 12 bit counter arranged as
four octal counters, one for each Success Indicator channel. Each octal unit
functions as a separate counter with the overflow from the channel 1 counter
indexing the channel 2 counter, overflow from channel 2 indexing channel 3,
and so on through channel 4. Overflow from channel 4 stops the automatic
recycling of the Arithmetic Unit. Each octal unit has separate front panel
control of both initial and final locations. It is reset to the initial
location at the start of a computation, counts up to the final location,
produces an overflow pulse, resets itself to the initial location, and repeats
the process until overflow from the channel 4 counter stops automatic recycling
of the Arithmetic Unit. A two bit location code is generated to indicate whether
the last advance pulse produces overflow from the first, second, or third
octal unit or no overflow.

The Step Size may be set (independently for each channel) to step
through 1/8, 1/4 or 1/2 of the total range on each index (or overflow) pulse.
The output of each octal unit is connected to the most significant 3 bits of
the corresponding (4-bit) digital-to-analog converter in the success
indicator. The least significant bit is controlled by a toggle switch (next
to the step size switch), and is used to shift the cell center to the upper
edge of one of the eight equal voltage intervals, when the step size is
greater than 1/8.


The  reason  for  the  flexibility  designed  into  the  Cell  Location 
Counter  is  to  allow  for  automatic  stepping  through  selected  limited  four 
dimensional  signal  space  to  explore,  in  more  detail,  regions  of  particular 
interest. 


3.2.5  Punch  Logic  and  Data  Format 

The  punch  logic  serves  to  commutate  the  data  into  the  paper  tape 
perforator  and  generate  the  necessary  timing  pulses  to  punch  two  7  bit 
characters  for  each  measurement  and  a  pulse  to  recycle  the  Arithmetic 
Unit. 


The output data is arranged as a 12 bit computer word compatible
with the CDC 160 computer. The least significant six bits are the Mantissa
and are obtained from the Sample Size Switch output. The next four bits are
the one's complement of the characteristic. The complement form is used
since the characteristic is always negative. By making the characteristic
the more significant part of the word (than the Mantissa) and using its
complement form, a monotonic relationship between the probability and its binary
representation is maintained, thus simplifying certain computer processing
of a probability density function. The most significant two bits are the location
code generated in the Cell Location Counter and will be useful in checking the
data while entering it in the computer.


Figure 13. Graphical Calculation of Sine Wave Density Function

The computer word is punched as two characters, the most significant
half being punched first and indicated by a "1" in the seventh hole position. A
typical tape for a sine wave first order density function is shown in the
upper right of Figure 13. The numbers punched are indicated in a convenient
octal form as shown to the right of the tape. Four octal characters are used
for two punched characters or one computer word. This notation is
convenient since the data can be easily typed out in this form using
standard programs. Only the least significant bit of the most significant
octal digit is part of the characteristic, and so this digit should be
considered as 1 for odd numbers (1, 3, 5, or 7) and 0 for even numbers (0, 2, 4, 6).
Since the most significant 2 bits of this octal digit represent the location
code, this code can be obtained by subtracting 1 from odd numbers and 0
from even numbers and dividing by 2.

3.2.6 Use of Paper Tape Output

For convenience in processing the output of the probability analyzer
in a digital computer, a paper tape punch has been provided with the
breadboard. The format of punched data has been chosen for convenience in using
the CDC 160 computer, as discussed in Section 3.2.5.


Even though the Probability Analyzer is a versatile device which
can provide probability estimates directly at its output, problems may arise
for which further processing is required. Two types of analyses which are
especially useful for evaluating the characteristics of n-order statistics
(after the n-dimensional probability function has been calculated) are the
examination of two-dimensional cross sections, and identification of the
modes (high density regions) of a probability density function. A third
function which might be performed with a computer is computation of
Pr{s ∈ R}, where R is a complicated region not easily instrumented in the
Success Region Detector, and where the probability density function, p_n(x),
has been calculated in the Probability Analyzer.


To illustrate the manner in which operations such as these might be
performed using the paper tape output of the Probability Analyzer, flow
charts of programs for (a) inserting data into the computer, and (b) computing
and displaying two dimensional cross sections of joint probability density
functions, p_n(x) (all but two of the {x_j} fixed), and conditional probability
densities, p(x_i | all x_j, i ≠ j), have been included in Appendix I.

3.3 CALIBRATION DATA

3.3.1 D.C. Calibration Data


A simplified block diagram of the comparator circuits, which
determine the cell size and cell location, is shown in Figure 14. The
data for the final DC calibration of these circuits (taken about 1 week prior
to delivery) is shown in Tables 2, 3, and 4. The cell location calibration
(Table 2) was made by measuring the input signal (from a precision voltage
reference source) required to set the output of the summing amplifier to
zero volts, for each binary input. Data for the two extremes and the center
point are presented. The worst case error for all points of all four channels
was less than 0.1 percent of full scale. The data in Table 2 were rechecked
after delivery to RADC and all points were within 10 mv without further
readjustment.


The cell size data of Table 3 were obtained by adding a small
(50 mv p-p) low frequency sinewave to the DC input, setting the binary input
to 1000 (see Table 2) and measuring the positive and negative input voltages
required to produce a square success waveform, f(s). The cell sizes indicated
in Table 3 are all within 1 percent of their nominal values. The values for the
1/8 cell size were rechecked after delivery to RADC and were all still within
the 1 percent tolerance and had not changed by more than 5 mv from the
initial calibration points.


The  Schmitt  Trigger  hysteresis  (Table  4)  was  also  rechecked  at 
RADC  and  all  values  were  again  within  5  mv  of  the  initial  calibration. 


3.3.2 Low Frequency Sine Wave Calibration

Figure 14. Simplified Diagram of Comparator Circuits

A graphical method of calculating the histogram of the first order
probability density function of a sinewave using 3 bit (8 level) quantization is
shown in Figure 13. The cell numbers, binary representation of the cell
centers, and voltages of the cell boundaries are shown with the sinewave in
the upper left corner. The corresponding success waveforms f(s) are
shown (for each cell) below the sinewave. The probability for each of the
eight cells is shown in the table to the right of the success waveform in
decimal, binary, and octal form. The binary and octal forms are in the
normalized form available as the analyzer output, with the complement of the
characteristic indicated. The x's in the binary form are location code
bits and may take on any value, depending on the channel being used
and the cell being measured as well as the setting of certain switches.
A paper tape representation of a typical sinewave density function is shown
above the table.
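
The same histogram can be obtained analytically rather than graphically. A sinewave of peak amplitude V spends a fraction [arcsin(b/V) - arcsin(a/V)]/π of each cycle between the levels a and b. The Python sketch below evaluates this for eight equal cells; it assumes the sinewave exactly fills the range covered by the eight cells, and the function name is hypothetical.

```python
import math


def sine_cell_probabilities(cells=8):
    """Fraction of a cycle that a full-scale sinewave spends in each of
    `cells` equal-width amplitude cells (amplitude normalized to +/-1)."""
    edges = [-1 + 2 * k / cells for k in range(cells + 1)]
    return [(math.asin(hi) - math.asin(lo)) / math.pi
            for lo, hi in zip(edges, edges[1:])]


probs = sine_cell_probabilities()
for cell, p in enumerate(probs):
    print(cell, round(p, 4))
```

The two outer cells each carry about 23 percent of the probability and the two center cells about 8 percent each, which is the characteristic U shape of the sinewave density.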


For calibration of the analog circuits the density function of a low
frequency sinewave was determined by measuring the duty cycle of the
success waveform, f(s). Table 5 lists these results for all four channels
with a 100 cps sinewave. These numbers agree with the calculated values
within the accuracy (a few percent) with which the scope could be
interpreted. A copy of a paper tape output from the Probability Analyzer
automatic measurement of the data in Table 5 is shown in Figure 15. The
cell numbers and octal representation of the punched data are shown next
to the tape. Note that the most significant octal characters are different
for each channel. This difference is the result of the location code bits,
which are different for each of the measurements. The results on this tape
are all within 1 bit (about 2 percent) of the calculated value (see sample
tape in Figure 13).


3.3.3 High Frequency Calibration Using a 10 kc Sinewave

Table 6 gives the results of the analog measurement and Table 7
the corresponding digital measurement of the density function of a 10 kc
sinewave. These results were obtained at RADC after delivery of the
equipment. The worst case error was about 10 percent; however, the digital and
analog measurements agree within a few percent. Slightly better results
were obtained previously at Waltham. Some of the errors in the
measurements given in Tables 6 and 7 may be due to the 50 mv noise level (about
4 percent of the cell size) of the sinewave generator. As a result of the noise
level, the sinewave had to be set slightly less than 10 v p-p in order to contain
the whole density function within the ±5 v range of the instrument. Because
of time limitations and the relative significance of the errors, the exact
cause was not tracked down.

[Table: f(s) for a 100 cps Sine Wave]

[Table: Measuring the Duty Cycle of the Success Waveform for a 10 kc Sine Wave]

Figure 15. Low Frequency Sinewave Calibration of the Four Channels

3.3.4 Noise Measurement


The density function of a noise generator output filtered by a
single pole 5 kc lowpass filter was measured for each channel with the
results shown in Table 8. The density function for the "Typical Channel"
was obtained by taking the mean or, in a few cases, the mode of the
four channels. The decimal equivalent of the histogram for the typical
channel is given as well as a calculated histogram. The basis for the
calculations was a Gaussian distribution with zero mean and a value of
σ chosen for best fitting of the typical channel curve (6.64σ = 10 v). The
fit is good near the center cells and gets progressively worse at the
outside cells. The ±5 percent asymmetry of the curve suggests the
possibility of non-symmetrical limiting which could occur in the analyzer due to
differences in the recovery time from positive and negative overloads.
This source of distortion could be investigated (and removed) by limiting
the input signal (using fast recovery diode limiters) to the ±5 volt range of
the analyzer. The typical and calculated histograms from Table 8 are plotted
in Figure 16 for comparison.
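
A calculated histogram of this kind can be reproduced from the Gaussian distribution function. The Python sketch below uses the fitted value 6.64σ = 10 v from the text and assumes eight equal cells spanning ±5 v; the function name and its defaults are hypothetical, and the numbers it prints are not taken from Table 8.

```python
import math


def gaussian_cell_probabilities(sigma, full_scale=5.0, cells=8):
    """Probability that a zero-mean Gaussian signal with standard
    deviation sigma falls in each of `cells` equal cells spanning
    -full_scale to +full_scale."""
    def cdf(x):
        return 0.5 * (1 + math.erf(x / (sigma * math.sqrt(2))))

    width = 2 * full_scale / cells
    edges = [-full_scale + k * width for k in range(cells + 1)]
    return [cdf(hi) - cdf(lo) for lo, hi in zip(edges, edges[1:])]


sigma = 10.0 / 6.64
probs = gaussian_cell_probabilities(sigma)
for cell, p in enumerate(probs):
    print(cell, round(p, 4))
print("probability outside the +/-5 v range:", round(1 - sum(probs), 4))
```

With this σ the two center cells each hold a little under 30 percent of the probability, and roughly 0.1 percent falls outside the ±5 v range, where any limiting in the analyzer would act.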


As a check on the higher order capabilities of the analyzer, the
joint probability of a success at Gate A and, about 1 ms later, a success
at Gate B was measured, with channel 2 connected to Gate A, channel 3
connected to Gate B, and the previous noise signal connected to
both channels 2 and 3. With binary inputs to both channels set to cell 3,
the measured probability was, in octal form, 1355, which is exactly the
product of the probabilities given for the corresponding cell and channels
in Table 8.


3.4 EXPANSION CAPABILITIES OF THE BREADBOARD
3.4.1 Modifications of the Analog Circuits

While the breadboard was designed to be a fairly versatile general
purpose probability analyzer, some limitations had to be imposed by cost
considerations. The limitations, however, are primarily in the analog
circuitry. Numerous simple modifications to these circuits could be made
to extend their usefulness in the solution of specific problems. Incorporation
of these modifications did not seem practical since their particular
configuration would be a function of the particular problem, and they might better
be made as temporary wiring changes than as unnecessarily complicated panel
controls.

Figure 16. Histogram for Gaussian Noise Waveform

One such modification would be provision for the generation of
irregularly shaped success regions. Since there are presently eight
hyperplane circuits in the equipment with convenient means for connecting
these circuit outputs to the success gates, only minor rewiring would be
necessary to provide for convenient insertion of additional summing
resistors.


A second modification would be to provide additional limiting
circuits so that the signals could be amplified further and examined in more
detail in narrower amplitude ranges of interest.


3.4.2 Use of a Digital Comparator

As mentioned in the general description, the breadboard is a 12-th
order binary or 4-th order octal probability analyzer. A considerable
increase in its flexibility could be obtained by providing for the use of
digital comparators as well as the present analog comparators. This
modification could be accomplished very easily with a small
amount of additional wiring, in such a way that either analog or digital
comparators could be plugged into the same connector, and the switching
would be automatic.


With this modification, any binary input could be applied to the
analyzer and compared with the cell location counter setting. The success
waveform would thus be "1" when both binary numbers are the same and
zero otherwise.
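
A minimal Python sketch of the digital comparison follows. It assumes the samples are already available as binary numbers (for instance from a recorded loop) and that the cell location counter is stepped through every value in turn; the function names are hypothetical.

```python
def digital_success(sample, cell_location):
    """Success is 1 when the binary input equals the cell location
    counter setting, and 0 otherwise."""
    return int(sample == cell_location)


def digital_density(samples, bits):
    """Estimate a first order density by replaying the same samples
    once for each cell location, as a stepped comparator would."""
    estimates = []
    for cell_location in range(2 ** bits):
        successes = sum(digital_success(s, cell_location) for s in samples)
        estimates.append(successes / len(samples))
    return estimates


print(digital_density([0, 1, 1, 3], bits=2))
```

Each pass over the samples corresponds to one measurement cycle of the analyzer, so a b-bit input needs 2**b passes for a complete density function.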


For example, a 12 bit A-D converter could be connected between
the signal and the analyzer, and a high resolution first order density function
measured automatically with each conversion representing one sample.
With two 6 bit A-D converters a higher resolution second order density
function could be measured.


If a set of signals were A-D converted (using any combination of
A-D converters with up to 12 bits output) and the resulting set of numbers
recorded on a magnetic tape loop, then the tape output could be compared
with the Cell Location Counter and a corresponding density function
calculated. This option would have the advantage of eliminating the
variation of signal characteristics with time which might be inherent
in an analog recording.


For pattern recognition problems it is often desirable to analyze
some analog waveforms along with the outputs of parameter extractors
which may have digital outputs. For such problems it would be convenient
to use one or more of the analog inputs to the analyzer and at the same
time use digital comparators in the other inputs. With the modifications cited
above, such problems can be solved readily.


4.  CONCLUSIONS  AND  RECOMMENDATIONS 


For estimating n-th order probabilities of the form Pr{s ∈ R},
where n > 1, the success counting method of estimation is the most versatile
and accurate of the various methods examined. If extremely low probabilities
are to be calculated (as when investigating "tails" of distribution functions),
the sampling and counting technique utilizing digital circuitry is preferable
to the analog technique of integrating under a success waveform.


The choice between parallel processing of samples utilizing a general
purpose computer and serial processing with a special purpose device hinges
on (1) the availability of a computer, (2) the capability to feed data directly
into the computer without intermediate storage, and (3) whether many
small-scale probability analyses will be performed, or relatively few large-scale
analyses. Consideration of (2) and (3) has led to implementation of the serial
processing method on this project.


Salient  features  of  the  serial  processing  breadboard  are: 

(1)  A  rough  estimate  of  an  entire  probability  density  function  can  be 
obtained  quickly  through  the  use  of  a  coarse  histogram  cell  structure. 

(2) Partial results (e.g., a portion of a probability density function)
can be obtained quickly without having to go through a complete calculation.

(3) Answers to practical problems of the form Pr{s ∈ R}, where R
may be a complicated region, can be obtained directly without going through
the intermediate step of calculating p_n(x).

(4)  Accuracy  requirements  can  be  translated  to  processing  time  with 
a  fairly  high  degree  of  confidence. 

(5)  Expansion  or  modification  of  the  device  to  accommodate  special 
requirements  is  readily  implementable. 
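
Feature (4) can be illustrated with the usual binomial sampling argument: with N independent trials, an estimate of a probability p has a relative standard error of about sqrt((1 - p)/(pN)). The Python sketch below turns a target relative error into a power-of-2 sample size and a processing time. The function names and the sample rate are hypothetical, independence of successive samples is assumed, and the limits of the Sample Size Selector Switch are not modeled.

```python
import math


def trials_needed(p, rel_error):
    """Trials for which the relative standard error of the estimate of
    p is no larger than rel_error (independent trials assumed)."""
    return math.ceil((1 - p) / (p * rel_error ** 2))


def sample_size_setting(p, rel_error):
    """Smallest power of 2 that is at least trials_needed()."""
    n = trials_needed(p, rel_error)
    return 1 << (n - 1).bit_length()


def processing_time(p, rel_error, sample_rate_hz):
    """Seconds needed to collect one cell at the given sample rate."""
    return sample_size_setting(p, rel_error) / sample_rate_hz


# Hypothetical case: p = 0.25, 25 percent relative error, 1000 samples/sec
print(trials_needed(0.25, 0.25))
print(sample_size_setting(0.25, 0.25))
print(processing_time(0.25, 0.25, 1000.0))
```

Because the required sample size grows as 1/p, cells in the tails of a distribution dominate the processing time for a fixed relative accuracy.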


A device of the sort described in Section 3 will provide answers to a
large variety of statistical questions, if suitable modifications are made in the
success region detector. Since these modifications will in most cases be
extremely inexpensive, but cannot be foreseen for all problems, it is suggested
that this breadboard be regarded and utilized as a general purpose digital
probability analyzer, taking full advantage of the provision for changing the success
region detector.

APPENDIX I. COMPUTER PROGRAM FLOW CHARTS


I.1 The following is a description of a method (Figure I-1) for storing
the data output from the Probability Analyzer into a computer in the same
arrangement that it held in the Probability Analyzer.


Each word in the computer is initialized to a code to indicate, at the
program's completion, the locations which do not contain pertinent information.
Control data consists of two four-digit octal location readings (the initial and
final locations for the data) and a four-digit switch setting. Each digit in the
latter is used to step the corresponding digit position in the read location during
program execution. An overflow condition is generated when any digit position
in the current read location exceeds its corresponding digit in the final location.
The type of overflow to be generated can be computed and compared to the overflow
indicator on the punched tape input to verify the data transmission.


Explanation of tape input (i.e., control data)

Start Location    D4 D3 D2 D1
Final Location    S4 S3 S2 S1
Switch Settings   SW4 SW3 SW2 SW1   [Possible settings: 1, 2, 4]
Overflow Codes    01 into D2
                  10 into D3
                  11 into D4

The D's are the octal digits of the initial read location; the S's are the
final settings; the SW's are the increments for stepping through each of the digit
positions.

Ex.  Start  2023 
Stop  4065 
Switches  2141 

Initialize Machine to "Codes"

Set Initial Read Location; Compute Count on Satisfying Units Position for
Storing

Determine Overflow Signal

Units Position Satisfied? (Yes / No)

Increment Read Location by Sense Switch 1

End of Data?

Does Overflow Signal from Tape Match Computed Overflow?

Signal = 11: Reset First, Second, and Third Positions in Read Location to
Original Settings. Update Fourth Position by Sense Switch 4.

Signal = 10: Reset First and Second Positions in Read Location to Original
Settings. Update Third Position by Sense Switch 3.

Signal = 01: Reset First Position in Read Location to Original Reading.
Update Second Position by Sense Switch 2.


Figure I-1. Data Input Program

Read Data Into    2023
                  2024
                  2025   01 overflow
                  2063
                  2064
                  2065   11 overflow
                  4023
                  4024
                  4025   01 overflow
                  4063
                  4064
                  4065   End of Data


"Codes" = any signal to indicate locations which do not contain
pertinent data after the execution of the program.


A method for determining what type of overflow signal will be found:

$$
\text{Count for Digit}_j = \frac{\left|\,\text{Digit}_j \text{ in Original Loc.} - \text{Digit}_j \text{ in Final}\,\right|}{\text{Switch Setting}_j} + 1
$$
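
The stepping rule and overflow codes can be checked against the example above with a short Python sketch. This is not the CDC 160 program of Figure I-1; the function names are hypothetical, digits are listed most significant first, and the end of data is reported as `None` instead of an overflow code.

```python
def count_for_digit(original, final, switch):
    """Number of values a digit position takes before it overflows."""
    return abs(original - final) // switch + 1


def read_locations(start, final, switches):
    """Yield (location, overflow_code) pairs in reading order.

    start, final and switches are lists [D4, D3, D2, D1].  The code is
    0 for no overflow, 1 (01) for overflow into D2, 2 (10) into D3,
    3 (11) into D4, and None once D4 itself overflows (end of data).
    """
    current = list(start)
    while True:
        label = "".join(str(d) for d in current)
        nxt = list(current)
        code = 0
        # Step the units position first and carry toward D4 as needed.
        for level, idx in enumerate((3, 2, 1, 0)):
            nxt[idx] += switches[idx]
            if nxt[idx] <= final[idx]:
                break
            nxt[idx] = start[idx]      # reset this position and carry
            code = level + 1
        else:
            yield label, None          # D4 overflowed: end of data
            return
        yield label, code
        current = nxt


for location, code in read_locations([2, 0, 2, 3], [4, 0, 6, 5], [2, 1, 4, 1]):
    print(location, code)
```

For the example control data the sketch visits the same twelve locations as the listing, with code 1 after 2025 and 4025 and code 3 after 2065.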

I.2 Figure I-2 is the flow chart of a technique which can be used in
calculating two-dimensional cross sections of n-th order probability density
functions, p_n(x_1, ..., x_n), for 1 < n ≤ 4. The equations below apply to said
flowchart.

$$
f(x_{ji}) = p_n(a_1, \ldots, a_{j-1}, x_{ji}, a_{j+1}, \ldots, a_n)
$$

where the $a = (a_1, \ldots, a_{j-1}, a_{j+1}, \ldots, a_n)$ are specified quantities.

$$
g(x_{ji}) = p(x_{ji} \mid a) = \frac{f(x_{ji})}{\sum_{s=1}^{c} f(x_{js})}
$$

where $i = 1, 2, \ldots, c$ and $1 < c \le 8$.
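
The normalization in the second equation amounts to dividing each cell of the cross section by the sum over all cells. A minimal Python sketch, with a hypothetical function name and made-up input values, is:

```python
def conditional_density(f):
    """Divide each value of a cross section f by the sum of all its
    values, giving the conditional probabilities for the free variable."""
    total = sum(f)
    if total == 0:
        raise ValueError("cross section is zero everywhere")
    return [value / total for value in f]


# Hypothetical cross section over three cells
print(conditional_density([1, 1, 2]))
```

The guard against a zero total covers fixed values of the other variables that were never observed, for which the conditional probability is undefined.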


I.3 In order to plot an array of two dimensional data on a typewriter, the
data should be ordered on the independent variable. The range of the dependent
variable is scaled across a given number of type positions. If it is desired to
scale the independent variable, the range is based on the number of lines per
page. Each parameter is scaled over the required range, the number of spaces
or carriage returns is computed, and a decimal point is typed in the appropriate
position.
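
The plotting procedure can be sketched in Python as follows. The function name and the width of 40 type positions are assumptions; only the dependent variable is scaled here, with one line per data point, so the optional scaling of the independent variable over the lines of a page is not modeled.

```python
def typewriter_plot(points, width=40):
    """Return one text line per (x, y) point, ordered on x, with a
    decimal point typed at the position given by the scaled y value."""
    points = sorted(points)               # order on the independent variable
    ys = [y for _, y in points]
    lo, hi = min(ys), max(ys)
    span = (hi - lo) or 1.0               # avoid dividing by zero
    lines = []
    for _, y in points:
        spaces = round((y - lo) / span * (width - 1))
        lines.append(" " * spaces + ".")
    return lines


for line in typewriter_plot([(2, 1.0), (0, 0.0), (1, 0.5)], width=5):
    print(line)
```

Each line corresponds to one carriage return, and the number of leading spaces is the scaled value of the dependent variable.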

Figure I-2. Calculation of 2-Dimensional Cross Sections of n-th Order
Probability Density Functions

Figure I-3. To Plot an Array of Two Dimensional Data
Note: Independent Variable must be Ordered

